Affiliation:
1. University of Wisconsin
Abstract
Existing main-memory data processing systems employ a variety of storage organizations and make a number of storage-related design choices. This paper systematically evaluates several of these key storage design choices for main-memory analytical (i.e., read-optimized) database settings. Our evaluation produces several key insights. First, it is always beneficial to organize data into self-contained memory blocks rather than large files. Second, column-stores and row-stores each display performance advantages for different types of queries, so for high performance both should be implemented as options for the tuple-storage layout. Third, cache-sensitive B+-tree indices can play a major role in accelerating query performance, especially when used in a block-oriented organization. Finally, compression can also accelerate query performance, depending on data distribution and query selectivity.
Subject
General Earth and Planetary Sciences; Water Science and Technology; Geography, Planning and Development
Cited by
14 articles.
1. Rethinking the Encoding of Integers for Scans on Skewed Data;Proceedings of the ACM on Management of Data;2023-12-08
2. On inter-operator data transfers in query processing;2022 IEEE 38th International Conference on Data Engineering (ICDE);2022-05
3. Qd-tree: Learning Data Layouts for Big Data Analytics;Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data;2020-06-11
4. Optimal column layout for hybrid workloads;Proceedings of the VLDB Endowment;2019-09
5. FishStore;Proceedings of the 2019 International Conference on Management of Data;2019-06-25