Tsunami

Published:2020-10 Issue:2 Volume:14 Page:74-86
ISSN:2150-8097
Container-title:Proceedings of the VLDB Endowment
language:en
Short-container-title:Proc. VLDB Endow.

Author:

Ding Jialin¹, Nathan Vikram¹, Alizadeh Mohammad¹, Kraska Tim¹

Affiliation:

1. Massachusetts Institute of Technology

Abstract

Filtering data based on predicates is one of the most fundamental operations for any modern data warehouse. Techniques to accelerate the execution of filter expressions include clustered indexes, specialized sort orders (e.g., Z-order), multi-dimensional indexes, and, for high selectivity queries, secondary indexes. However, these schemes are hard to tune and their performance is inconsistent. Recent work on learned multi-dimensional indexes has introduced the idea of automatically optimizing an index for a particular dataset and workload. However, the performance of that work suffers in the presence of correlated data and skewed query workloads, both of which are common in real applications. In this paper, we introduce Tsunami, which addresses these limitations to achieve up to 6X faster query performance and up to 8X smaller index size than existing learned multi-dimensional indexes, in addition to up to 11X faster query performance and 170X smaller index size than optimally-tuned traditional indexes.

Publisher

VLDB Endowment

Subject

General Earth and Planetary Sciences,Water Science and Technology,Geography, Planning and Development

Link

https://dl.acm.org/doi/pdf/10.14778/3425879.3425880

Cited by 70 articles.

1. Revisiting Learned Index with Byte-addressable Persistent Storage;Proceedings of the 53rd International Conference on Parallel Processing;2024-08-12

2. A Survey of Multi-Dimensional Indexes: Past and Future Trends;IEEE Transactions on Knowledge and Data Engineering;2024-08

3. Blueprinting the Cloud: Unifying and Automatically Optimizing Cloud Data Infrastructures with BRAD;Proceedings of the VLDB Endowment;2024-07

4. Predicate Caching: Query-Driven Secondary Indexing for Cloud Data Warehouses;Companion of the 2024 International Conference on Management of Data;2024-06-09

5. Stage: Query Execution Time Prediction in Amazon Redshift;Companion of the 2024 International Conference on Management of Data;2024-06-09