Affiliation:
1. School of Software, Shandong University, Jinan, China
2. National Supercomputing Center in Wuxi, Wuxi, China
3. Department of Computer Science and Technology, Tsinghua University, Beijing, China
4. Zhejiang Lab, Hangzhou, China
Abstract
High‐performance computing is assuming an increasingly fundamental role in advancing scientific research and engineering. However, the ever‐expanding scale of scientific simulations poses challenges for efficient data I/O and storage. Data compression has garnered significant attention as a way to reduce data transmission and storage costs while enhancing performance. In particular, the BZIP2 lossless compression algorithm has been widely used due to its exceptional compression ratio, moderate compression speed, high reliability, and open‐source nature. This paper focuses on the design and implementation of a parallelized BZIP2 algorithm tailored for deployment on the New‐Generation Sunway supercomputing platform. By leveraging the unique cache patterns of the New‐Generation Sunway processor, we propose highly tuned multi‐threading and multi‐node implementations of BZIP2 for different scenarios. We also propose efficient BZIP2 libraries, based on the management processing element and the computing processing element, that support the commonly used high‐level (de)compression interfaces. The test results indicate that our multi‐threading implementation achieves a maximum speedup of 23.09 (8.57) in decompression (compression) compared with the sequential implementation. Furthermore, the multi‐node implementation achieves 50.81% (26.35%) parallel efficiency and a peak performance of 16.6 GB/s (52.8 GB/s) for compression (decompression) when scaling up to 2048 processes.
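The general idea behind parallel BZIP2 can be illustrated with a minimal sketch. It is not the Sunway implementation described in the abstract; it only shows the common chunk‐level approach on a generic multi‐core machine, using Python's standard `bz2` module. The function name `parallel_compress`, the `workers` parameter, and the 900,000‐byte chunk size are assumptions made for illustration. Each chunk is compressed independently into its own bzip2 stream, and the streams are concatenated; a multi‐stream file of this kind can be read back by `bz2.decompress`.

```python
import bz2
from concurrent.futures import ThreadPoolExecutor

# Hypothetical chunk size for this sketch; it mirrors the largest bzip2
# block size (compression level 9) but is not taken from the paper.
CHUNK_SIZE = 900_000


def parallel_compress(data: bytes, workers: int = 4,
                      chunk_size: int = CHUNK_SIZE) -> bytes:
    """Compress independent chunks concurrently and concatenate the streams."""
    # Split the input into fixed-size chunks (the last one may be shorter).
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so the streams stay in sequence.
        streams = list(pool.map(bz2.compress, chunks))
    # Concatenated bzip2 streams form a valid multi-stream bzip2 file.
    return b"".join(streams)


if __name__ == "__main__":
    payload = b"simulation output " * 200_000
    packed = parallel_compress(payload)
    # bz2.decompress handles multi-stream input, so the round trip is exact.
    assert bz2.decompress(packed) == payload
    print(f"{len(payload)} bytes -> {len(packed)} bytes")
```

Because every chunk is an independent stream, the compression ratio can be slightly lower than for a single stream, which is the usual trade‐off of this approach.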
Funder
National Key Research and Development Program of China
Subject
General Engineering, General Computer Science