MirrorKV: An Efficient Key-Value Store on Hybrid Cloud Storage with Balanced Performance of Compaction and Querying

Published: 2023-12-08 Issue: 4 Volume: 1 Page: 1-27
ISSN: 2836-6573
Container-title: Proceedings of the ACM on Management of Data
Language: en
Short-container-title: Proc. ACM Manag. Data

Author:

Wang Zhiqi¹, Shao Zili¹

Affiliation:

1. The Chinese University of Hong Kong, Hong Kong, Hong Kong

Abstract

LSM-based key-value stores have been leveraged in many state-of-the-art data-intensive applications as storage engines. As data volume scales up, a cost-efficient approach is to deploy these applications on hybrid cloud storage with hot/cold separation, which splits the LSM-tree into two parts and thus brings new challenges on how to split and how to close the significant performance gap between these two parts. Existing LSM-tree key-value stores mainly focus on the optimizations of local storage, which incurs sub-optimal performance when directly applied to hybrid storage. In this paper, we present MirrorKV for efficient compaction and querying on hybrid cloud storage. First, based on the capacities of fast and slow cloud storage, MirrorKV vertically separates hot/cold data of different levels stored in different cloud storage with different compaction mechanisms. To avoid compaction in slow storage being the bottleneck of the write path, MirrorKV proposes a novel virtual split to only compact the metadata during the compaction, which postpones the actual compaction until it reaches deep enough levels. Second, to reduce accessing slow storage during querying, MirrorKV horizontally separates keys and values into two mirrored LSM-trees to differentiate caching priorities; the maintained tree structures preserve the data locality for efficient sequential reading without incurring the overhead of the traditional key-value separation solutions. Finally, MirrorKV leverages cached data to guide the compaction where the hot data is retained in the fast storage while the cold data is compacted to deeper levels in slow storage. Compared with RocksDB-cloud, MirrorKV achieves 2.4× higher random insertion throughput, 29% higher random read throughput, and 99% less compaction time.
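The abstract's idea of compacting only metadata and postponing the actual compaction until deeper levels can be illustrated with a toy model. The sketch below is not MirrorKV's algorithm: `FileMeta`, `ToyLSM`, `physical_level`, and `files_rewritten` are hypothetical names introduced here, and the policy (move file descriptors between shallow levels, rewrite data only at a configured depth) is an assumed simplification of what the abstract describes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FileMeta:
    """Descriptor of one immutable sorted file (hypothetical structure)."""
    file_id: int
    min_key: str
    max_key: str


@dataclass
class ToyLSM:
    """Toy LSM-tree that tracks file metadata only; no real data is stored."""
    physical_level: int          # target levels >= this rewrite data (assumed policy)
    num_levels: int = 4
    levels: List[List[FileMeta]] = field(default_factory=list)
    files_rewritten: int = 0     # counts files read and rewritten by physical merges
    next_id: int = 100

    def __post_init__(self) -> None:
        if not self.levels:
            self.levels = [[] for _ in range(self.num_levels)]

    def compact(self, level: int) -> str:
        """Push every file of `level` into `level + 1`; return the kind of step."""
        if level + 1 >= len(self.levels):
            raise ValueError("no deeper level to compact into")
        moved = self.levels[level]
        self.levels[level] = []
        target = self.levels[level + 1]
        if level + 1 < self.physical_level:
            # Metadata-only step: descriptors change level, no data is rewritten.
            target.extend(moved)
            target.sort(key=lambda f: f.min_key)
            return "virtual"
        # Deferred physical step: merge all inputs into one new file.
        inputs = target + moved
        if inputs:
            merged = FileMeta(
                self.next_id,
                min(f.min_key for f in inputs),
                max(f.max_key for f in inputs),
            )
            self.next_id += 1
            self.files_rewritten += len(inputs)
            self.levels[level + 1] = [merged]
        return "physical"


lsm = ToyLSM(physical_level=3)
lsm.levels[1] = [FileMeta(1, "a", "f"), FileMeta(2, "g", "m")]
first = lsm.compact(1)    # shallow level: only metadata moves
second = lsm.compact(2)   # deep enough: data is actually merged
```

In this toy model the first call moves two descriptors without rewriting anything, and the rewrite cost is paid once, at the level where the physical merge is configured to happen.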

Funder

Direct Grant for Research, The Chinese University of Hong Kong

Research Grants Council of the Hong Kong Special Administrative Region, China

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.1145/3626736

References (62 articles)

1. Building a database on S3

2. Helen H. W. Chan, Yongkun Li, Patrick P. C. Lee, and Yinlong Xu. 2018. HashKV: Enabling Efficient Updates in KV Storage via Hashing. In 2018 USENIX Annual Technical Conference (USENIX ATC 18). USENIX Association, Boston, MA, 1007-1019. https://www.usenix.org/conference/atc18/presentation/chan

3. Cosine

4. Hao Chen, Chaoyi Ruan, Cheng Li, Xiaosong Ma, and Yinlong Xu. 2021. SpanDB: A Fast, Cost-Effective LSM-tree Based KV Store on Hybrid Storage. In 19th USENIX Conference on File and Storage Technologies (FAST 21). USENIX Association, 17-32. https://www.usenix.org/conference/fast21/presentation/chen-hao

5. cockroachdb 2022. CockroachDB. https://www.cockroachlabs.com/product/.

Cited by 1 article.

1. Accelerating Native Transaction Processing in LSM-Based Persistent Key-Value Stores. 2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). 2024-05-27.