Affiliation:
1. Department of Software, Dankook University, Yongin 16890, Republic of Korea
Abstract
The prevalence of big data has caused a notable surge in both the diversity and the volume of data. This has prompted the emergence and advancement of two distinct technologies: unstructured data management and data volume reduction. Key–value stores, such as Google’s LevelDB and Meta’s RocksDB, have become a popular solution for managing unstructured data because they handle diverse data types with a simple key–value abstraction. At the same time, many data management tools have adopted compression techniques, such as Snappy and Zstd, to reduce data volume. The objective of this study is to explore how these two technologies influence each other. For this purpose, we first examine a classification of compression techniques and discuss their strengths and weaknesses, especially for those adopted by modern key–value stores. We also investigate the internal structures and operations of key–value stores, such as batch writing and compaction, in order to understand their characteristics. Then, we quantitatively evaluate the compression ratio and performance using RocksDB under diverse compression techniques, block sizes, value sizes, and workloads. Our evaluation shows that compression not only saves storage space but also decreases compaction overhead. It also reveals that compression techniques have inherent trade-offs: some provide a better compression ratio, while others yield better compression performance. Based on our evaluation, we identify a number of potential avenues for further research, including a compression-aware compaction mechanism, selective compression, and a revisiting of compression granularity.
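The interaction between block size and compression ratio mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's benchmark and does not use RocksDB: it assumes Python's standard-library `zlib` and `lzma` codecs as stand-ins for Snappy and Zstd, and the record layout, block sizes, and the helper names `make_records` and `blockwise_ratio` are hypothetical choices made only for illustration.

```python
import lzma
import zlib
from typing import Callable


def make_records(count: int, value_size: int) -> bytes:
    """Build synthetic key-value records (hypothetical data, for illustration only)."""
    records = []
    for i in range(count):
        key = f"user{i:08d}".encode()
        # Repetitive values compress well; real workloads will differ.
        value = (f"field{i % 16}=".encode() * value_size)[:value_size]
        records.append(key + b":" + value + b"\n")
    return b"".join(records)


def blockwise_ratio(data: bytes, block_size: int,
                    compress: Callable[[bytes], bytes]) -> float:
    """Compress each block independently and return original/compressed size."""
    compressed = 0
    for start in range(0, len(data), block_size):
        compressed += len(compress(data[start:start + block_size]))
    return len(data) / compressed


if __name__ == "__main__":
    data = make_records(count=2000, value_size=100)
    # Stand-in codecs: a fast setting, a thorough setting, and a heavier algorithm.
    codecs = {
        "zlib-1": lambda b: zlib.compress(b, 1),
        "zlib-9": lambda b: zlib.compress(b, 9),
        "lzma": lzma.compress,
    }
    for name, codec in codecs.items():
        for block_size in (4096, 16384, 65536):
            ratio = blockwise_ratio(data, block_size, codec)
            print(f"{name:7s} block={block_size:6d} ratio={ratio:.2f}")
```

Because each block is compressed on its own, smaller blocks give the codec less context and pay per-block overhead more often, which is one plausible reason compression granularity appears among the research directions listed above. The absolute numbers from this sketch depend entirely on the synthetic data and should not be read as results.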
Funder
Korea government
Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government
Subject
Electrical and Electronic Engineering, Computer Networks and Communications, Hardware and Architecture, Signal Processing, Control and Systems Engineering
Cited by
2 articles.