Extremely-Compressed SSDs with I/O Behavior Prediction

Published: 2024-08-06 Issue: 4 Volume: 20 Page: 1-38
ISSN: 1553-3077
Container-title: ACM Transactions on Storage
Language: en
Short-container-title: ACM Trans. Storage

Author:

Yao Xiangyu¹, Li Qiao¹, Lin Kaihuan¹, Gan Xinbiao², Zhang Jie³, Gao Congming¹, Shen Zhirong¹, Xu Quanqing⁴, Yang Chuanhui⁴, Xue Jason⁵

Affiliation:

1. School of Informatics, Xiamen University, Xiamen, China

2. National University of Defense Technology, Changsha, China

3. The School of Computer Science, Peking University, Beijing, China

4. OceanBase, Ant Group, Hangzhou, China

5. Mohamed bin Zayed University of Artificial Intelligence, Masdar City, United Arab Emirates

Abstract

As data volume continues to grow exponentially, demand for storage system capacity keeps increasing. Data compression reduces the volume of written data and thereby improves space efficiency, so many modern SSDs already incorporate compression. However, compression adds processing overhead on the critical I/O path, which can hurt system performance. Most current compression solutions in flash-based storage systems apply a fixed compression algorithm to all incoming data and do not exploit differences among data access patterns, which leads to sub-optimal compression efficiency. This article proposes a data-type-aware Flash Translation Layer (DAFTL) scheme to maximize space efficiency without compromising system performance. First, we propose an I/O behavior prediction method to forecast future accesses to specific data. Then, DAFTL matches data types with distinct I/O behaviors to compression algorithms of varying intensities, balancing performance against space efficiency. Specifically, it employs higher-intensity compression algorithms for less frequently accessed data to maximize space efficiency, and lower-intensity but faster compression algorithms for frequently accessed data to maintain system performance. Finally, an improved compact compression method is proposed to eliminate page fragmentation and further enhance space efficiency. Extensive evaluations using a variety of real-world workloads, as well as workloads with real data collected on our platforms, demonstrate that DAFTL achieves greater data reduction than other approaches. Compared to state-of-the-art compression schemes, DAFTL reduces the total number of pages written to the SSD by an average of 8%, 21.3%, and 25.6% for data with high, medium, and low compressibility, respectively. For workloads with real data, DAFTL reduces the total number of pages written to the SSD by 10.4% on average. Furthermore, DAFTL exhibits comparable or even improved read and write performance compared to other solutions.
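The idea of matching access frequency to compression intensity, and of packing compressed data to avoid page fragmentation, can be illustrated with a minimal sketch. This is not the DAFTL implementation: the abstract does not specify the predictor, the algorithms, or any thresholds. The sketch assumes a predicted access count is already available, uses `zlib` compression levels as a stand-in for algorithms of different intensity, and assumes a 4 KiB page size. All function names and threshold values are hypothetical.

```python
import zlib

PAGE_SIZE = 4096  # assumed flash page size in bytes (not stated in the abstract)


def choose_level(predicted_accesses: int,
                 hot_threshold: int = 8,
                 warm_threshold: int = 2) -> int:
    """Map a predicted access count to a zlib level.

    Hypothetical policy: hot data gets the fastest level, cold data the
    strongest. The thresholds are illustrative assumptions only.
    """
    if predicted_accesses >= hot_threshold:
        return 1   # fast, lower compression ratio
    if predicted_accesses >= warm_threshold:
        return 6   # middle ground
    return 9       # slow, highest compression ratio


def compress_chunk(data: bytes, predicted_accesses: int) -> bytes:
    """Compress one chunk with the level chosen for its predicted behavior."""
    return zlib.compress(data, choose_level(predicted_accesses))


def pages_needed_padded(sizes: list[int]) -> int:
    """Pages used when every compressed chunk starts on a fresh page."""
    return sum(-(-size // PAGE_SIZE) for size in sizes)


def pages_needed_compact(sizes: list[int]) -> int:
    """Pages used when compressed chunks are packed back to back."""
    return -(-sum(sizes) // PAGE_SIZE)
```

For three compressed chunks of 1000, 1000, and 3000 bytes, the padded layout occupies three pages while the packed layout occupies two. That gap is the kind of fragmentation a compact layout is meant to remove, although the paper's actual compact compression method may differ from this simplified packing.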

Funder

National Key R&D Program of China

Natural Science Foundation of Xiamen

National Natural Science Foundation of China

China Fundamental Research Funds for the Central Universities

Ant Group through CCF-Ant Research Fund

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.1145/3677044
