Affiliation:
1. Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
Abstract
Learned indexes have been demonstrated to outperform traditional ones in memory-resident scenarios. However, recent studies show that they fail to outperform B+tree when extended to disks directly. In this paper, we argue that it is feasible to create efficient disk-based learned indexes by applying a set of general transformations and optimizations to existing in-memory ones. Through theoretical analysis and controlled experiments, we propose six transformation guidelines applicable to various state-of-the-art learned index structures to fully leverage the characteristics of disk storage. Our evaluation shows that the indexes developed by applying our guidelines achieve a Pareto improvement in both throughput and space efficiency compared to the traditional B+tree and previous implementations of disk-based learned indexes.
Publisher
Association for Computing Machinery (ACM)