Affiliation:
1. Purdue University, West Lafayette, IN, USA
Abstract
B-trees are widely recognized as one of the most important index structures in database systems, providing efficient query processing capabilities. Over the past few decades, many techniques have been developed to enhance the efficiency of B-trees from various perspectives. Among them, B-tree compression is an important technique introduced as early as the 1970s to improve both space efficiency and query performance. Since then, several B-tree compression techniques have been developed. However, to our surprise, we have found that these B-tree compression techniques were never compared against each other in prior work. Consequently, many important questions remain unanswered: Is B-tree compression truly effective? If so, under what scenarios, and which B-tree compression methods should be employed? In this paper, we conduct the first experimental evaluation of seven widely used B-tree compression techniques using both synthetic and real datasets. Based on our evaluation, we present lessons and insights that can be leveraged to guide system design decisions in modern databases regarding the use of B-tree compression.
Publisher
Association for Computing Machinery (ACM)