*** This is the session I did for the Sri Lanka Data Community – Feb 2022 episode. It does not contain the entire event.
In modern data solutions that combine data warehousing and big data, analytics is not limited to columnar-based queries; highly selective queries are now part of the trend too. This session shows one way of optimizing such queries with Hyperspace, an indexing subsystem for Apache Spark.
Read more at:
https://docs.microsoft.com/en-us/azure/synapse-analytics/spark/apache-spark-performance-hyperspace?pivots=programming-language-csharp/?WT.mc_id=DP-MVP-33296
The session covers:
- What Hyperspace is and how it helps us add indexes
- How to list indexes and what changes Hyperspace makes to the data lake
- How to check whether indexes are used for queries
- How to compare an indexed query with a non-indexed query using the provided APIs
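For readers who want a feel for these steps before watching, here is a minimal PySpark sketch. It assumes the Hyperspace Python bindings as described in the Azure Synapse documentation linked above (`Hyperspace`, `IndexConfig`, `createIndex`, `indexes`, `explain`, `Hyperspace.enable`). The function name, the index name, the column names `deptId` and `deptName`, and the data path are all hypothetical and not taken from the session.

```python
def demo_hyperspace(spark, data_path):
    """Create a covering index, list indexes, and compare query plans.

    `spark` is an existing SparkSession; `data_path` points to a Parquet
    dataset with (hypothetical) columns deptId and deptName.
    """
    # Imported lazily so this file loads even where Hyperspace is not installed.
    from hyperspace import Hyperspace, IndexConfig

    df = spark.read.parquet(data_path)
    hs = Hyperspace(spark)

    # Index name, indexed columns (used for lookups), included columns (returned).
    hs.createIndex(df, IndexConfig("deptIndex", ["deptId"], ["deptName"]))

    # List the indexes that Hyperspace knows about.
    hs.indexes().show()

    # A highly selective query that touches only the indexed and included columns.
    query = df.filter("deptId = 20").select("deptName")

    # Let the Spark optimizer consider Hyperspace indexes for this session.
    Hyperspace.enable(spark)

    # Print the plan with and without indexes, plus the indexes that were used.
    hs.explain(query, True)
    return query
```

In a Synapse notebook this would be called as `demo_hyperspace(spark, "<path to your Parquet data>")`. The index is only considered when the query's filter and selected columns are covered by the indexed and included columns.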
If you have any questions, please add them to the comment section.