MirrorKV: An Efficient Key-Value Store on Hybrid Cloud Storage with Balanced Performance of Compaction and Querying

Author:

Wang Zhiqi1ORCID,Shao Zili1ORCID

Affiliation:

1. The Chinese University of Hong Kong, Hong Kong, Hong Kong

Abstract

LSM-based key-value stores have been leveraged in many state-of-the-art data-intensive applications as storage engines. As data volume scales up, a cost-efficient approach is to deploy these applications on hybrid cloud storage with hot/cold separation, which splits the LSM-tree into two parts and thus brings new challenges on how to split and how to close the significant performance gap between these two parts. Existing LSM-tree key-value stores mainly focus on the optimizations of local storage, which incurs sub-optimal performance when directly applied to hybrid storage. In this paper, we present MirrorKV for efficient compaction and querying on hybrid cloud storage. First, based on the capacities of fast and slow cloud storage, MirrorKV vertically separates hot/cold data of different levels stored in different cloud storage with different compaction mechanisms. To avoid compaction in slow storage being the bottleneck of the write path, MirrorKV proposes a novel virtual split to only compact the metadata during the compaction, which postpones the actual compaction until it reaches deep enough levels. Second, to reduce accessing slow storage during querying, MirrorKV horizontally separates keys and values into two mirrored LSM-trees to differentiate caching priorities; the maintained tree structures preserve the data locality for efficient sequential reading without incurring the overhead of the traditional key-value separation solutions. Finally, MirrorKV leverages cached data to guide the compaction where the hot data is retained in the fast storage while the cold data is compacted to deeper levels in slow storage. Compared with RocksDB-cloud, MirrorKV achieves 2.4× higher random insertion throughput, 29% higher random read throughput, and 99% less compaction time.

Funder

Direct Grant for Research, The Chinese University of Hong Kong

Research Grants Council of the Hong Kong Special Administrative Region, China

Publisher

Association for Computing Machinery (ACM)

Reference62 articles.

1. Building a database on S3

2. Helen H. W. Chan , Yongkun Li , Patrick P. C. Lee , and Yinlong Xu . 2018 . HashKV: Enabling Efficient Updates in KV Storage via Hashing . In 2018 USENIX Annual Technical Conference (USENIX ATC 18) . USENIX Association, Boston, MA, 1007--1019. https://www.usenix.org/conference/atc18/presentation/chan Helen H. W. Chan, Yongkun Li, Patrick P. C. Lee, and Yinlong Xu. 2018. HashKV: Enabling Efficient Updates in KV Storage via Hashing. In 2018 USENIX Annual Technical Conference (USENIX ATC 18). USENIX Association, Boston, MA, 1007--1019. https://www.usenix.org/conference/atc18/presentation/chan

3. Cosine

4. Hao Chen , Chaoyi Ruan , Cheng Li , Xiaosong Ma , and Yinlong Xu . 2021 . SpanDB: A Fast , Cost-Effective LSM-tree Based KV Store on Hybrid Storage. In 19th USENIX Conference on File and Storage Technologies (FAST 21) . USENIX Association, 17--32. https://www.usenix.org/conference/fast21/presentation/chen-hao Hao Chen, Chaoyi Ruan, Cheng Li, Xiaosong Ma, and Yinlong Xu. 2021. SpanDB: A Fast, Cost-Effective LSM-tree Based KV Store on Hybrid Storage. In 19th USENIX Conference on File and Storage Technologies (FAST 21). USENIX Association, 17--32. https://www.usenix.org/conference/fast21/presentation/chen-hao

5. cockroachdb 2022. CockroachDB. https://www.cockroachlabs.com/product/. cockroachdb 2022. CockroachDB. https://www.cockroachlabs.com/product/.

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3