Power-optimized Deployment of Key-value Stores Using Storage Class Memory

Author:

Kassa Hiwot Tadese (1), Akers Jason (2), Ghosh Mrinmoy (2), Cao Zhichao (2), Gogte Vaibhav (1), Dreslinski Ronald (1)

Affiliation:

1. University of Michigan, Ann Arbor, MI, USA

2. Meta, Inc., Menlo Park, CA, USA

Abstract

High-performance flash-based key-value stores in data centers use large amounts of DRAM to cache hot data. However, motivated by the high cost and power consumption of DRAM, server designs with a lower DRAM-per-compute ratio are becoming popular. These low-cost servers enable scale-out services by reducing server workload densities. This improves overall service reliability and decreases the total cost of ownership (TCO) for scalable workloads. Nevertheless, for key-value stores with large memory footprints, these reduced-DRAM servers degrade performance because both IO utilization and data access latency increase. In this scenario, a standard practice to improve performance for sharded databases is to reduce the number of shards per machine, which erodes the TCO benefits of reduced-DRAM, low-cost servers. In this work, we explore a practical solution to improve performance and reduce the cost and power consumption of key-value stores running on DRAM-constrained servers by using Storage Class Memories (SCM). SCMs in a DIMM form factor, although slower than DRAM, are sufficiently faster than flash when serving as a large extension to DRAM. With new technologies like Compute Express Link (CXL), we can expand the memory capacity of servers with high-bandwidth, low-latency connectivity to SCM. In this article, we use Intel Optane PMem 100 Series SCMs (DCPMM) in AppDirect mode to extend the available memory of our existing single-socket platform deployment of RocksDB (one of the largest key-value stores at Meta). We first designed a hybrid cache in RocksDB to harness both DRAM and SCM hierarchically. We then characterized the performance of the hybrid cache for three of the largest RocksDB use cases at Meta (ChatApp, BLOB Metadata, and Hive Cache). Our results demonstrate that we can achieve up to 80% improvement in throughput and 20% improvement in P95 latency over the existing small-DRAM single-socket platform, while maintaining a 43–48% cost improvement over our large-DRAM dual-socket platform. To the best of our knowledge, this is the first study of the DCPMM platform in a commercial data center.
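
To make the idea of a hierarchical DRAM-plus-SCM cache concrete, the following is a minimal Python sketch of a generic two-tier LRU cache. It is not the hybrid cache described in the abstract: the class name `TwoTierCache`, the capacities, and the policies (demote to the slow tier on eviction from the fast tier, promote back on a hit) are all assumptions chosen for illustration, and plain dictionaries stand in for DRAM and SCM.

```python
from collections import OrderedDict


class TwoTierCache:
    """Toy hierarchical cache: a small fast tier backed by a larger slow tier.

    Hypothetical illustration only; both tiers are ordinary in-memory dicts.
    """

    def __init__(self, dram_capacity, scm_capacity):
        self.dram_capacity = dram_capacity
        self.scm_capacity = scm_capacity
        self.dram = OrderedDict()  # stands in for the DRAM tier, LRU order
        self.scm = OrderedDict()   # stands in for the SCM tier, LRU order

    def get(self, key):
        # Fast-tier hit: refresh recency and return.
        if key in self.dram:
            self.dram.move_to_end(key)
            return self.dram[key]
        # Slow-tier hit: promote the entry to the fast tier (assumed policy).
        if key in self.scm:
            value = self.scm.pop(key)
            self._insert_dram(key, value)
            return value
        # Miss in both tiers: the caller would read from flash.
        return None

    def put(self, key, value):
        # Drop any stale copy in the slow tier, then insert into the fast tier.
        self.scm.pop(key, None)
        self._insert_dram(key, value)

    def _insert_dram(self, key, value):
        self.dram[key] = value
        self.dram.move_to_end(key)
        # Demote least recently used entries to the slow tier when full.
        while len(self.dram) > self.dram_capacity:
            old_key, old_value = self.dram.popitem(last=False)
            self._insert_scm(old_key, old_value)

    def _insert_scm(self, key, value):
        self.scm[key] = value
        self.scm.move_to_end(key)
        # Entries evicted from the slow tier are dropped entirely.
        while len(self.scm) > self.scm_capacity:
            self.scm.popitem(last=False)
```

In this sketch an entry leaves the cache only after it has aged out of both tiers, so the slow tier acts as a capacity extension of the fast one. A real deployment would also have to account for block sizes, admission policy, and the different access costs of each tier, none of which are modeled here.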

Publisher

Association for Computing Machinery (ACM)

Subject

Hardware and Architecture

Cited by 8 articles.

1. DiStore: A Fully Memory Disaggregation Friendly Key-Value Store with Improved Tail Latency and Space Efficiency;Proceedings of the 53rd International Conference on Parallel Processing;2024-08-12

2. Can Modern LLMs Tune and Configure LSM-based Key-Value Stores?;Proceedings of the 16th ACM Workshop on Hot Topics in Storage and File Systems;2024-07-08

3. Can ZNS SSDs be Better Storage Devices for Persistent Cache?;Proceedings of the 16th ACM Workshop on Hot Topics in Storage and File Systems;2024-07-08

4. CaaS-LSM: Compaction-as-a-Service for LSM-based Key-Value Stores in Storage Disaggregated Infrastructure;Proceedings of the ACM on Management of Data;2024-05-29

5. Research on High-Performance Framework of Big Data Acquisition, Storage and Application for Warfare Simulation;2024 IEEE 4th International Conference on Electronic Technology, Communication and Information (ICETCI);2024-05-24
