Affiliation:
1. Zhijiang College, Zhejiang University of Technology, Hangzhou 310024, China
2. College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, China
Abstract
Sparse matrix-vector multiplication (SpMV) is an important operation in scientific computations. Compressed sparse row (CSR) is the most frequently used format to store sparse matrices. However, CSR-based SpMVs on graphics processing units (GPUs), for example, CSR-scalar and CSR-vector, usually perform poorly because of irregular memory access patterns. This motivates us to propose a perfect CSR-based SpMV on the GPU, called PCSR. PCSR involves two kernels and, by introducing a middle array, accesses the CSR arrays in a fully coalesced manner, which greatly alleviates the deficiencies of CSR-scalar (rare coalescing) and CSR-vector (partial coalescing). Test results on a single C2050 GPU show that PCSR fully outperforms CSR-scalar, CSR-vector, and the CSRMV and HYBMV routines in the vendor-tuned CUSPARSE library, and is comparable with a recently proposed CSR-based algorithm, CSR-Adaptive. Furthermore, we extend PCSR from a single GPU to multiple GPUs. Experimental results on four C2050 GPUs show that, whether or not the communication between GPUs is considered, PCSR on multiple GPUs achieves good performance and high parallel efficiency.
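For readers unfamiliar with CSR, the following is a minimal sequential sketch in Python, not the authors' GPU kernels. The first function is a plain row-by-row CSR SpMV. The second splits the work into two phases joined by a middle array, which is one plausible reading of the abstract's "two kernels" and "middle array"; the abstract does not spell out the exact kernel layout, so that split is an assumption. All names (`row_ptr`, `col_idx`, `values`, `middle`) are illustrative.

```python
def csr_spmv(row_ptr, col_idx, values, x):
    """Row-by-row CSR SpMV: y[i] = sum of values[k] * x[col_idx[k]] over row i."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        # Nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i + 1] - 1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y


def two_phase_spmv(row_ptr, col_idx, values, x):
    """Two-phase variant (assumed structure, see the note above)."""
    # Phase 1: one product per nonzero, written contiguously to a middle array.
    middle = [values[k] * x[col_idx[k]] for k in range(len(values))]
    # Phase 2: sum each row's contiguous segment of the middle array.
    n_rows = len(row_ptr) - 1
    return [sum(middle[row_ptr[i]:row_ptr[i + 1]], 0.0) for i in range(n_rows)]


# Example: the 3x3 matrix
#   [1 0 2]
#   [0 3 0]
#   [4 0 5]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [1.0, 1.0, 1.0]

y_ref = csr_spmv(row_ptr, col_idx, values, x)
y_two = two_phase_spmv(row_ptr, col_idx, values, x)
```

In the first phase every nonzero is handled independently and the CSR arrays are read in storage order, which is the kind of access pattern that can be coalesced on a GPU; the sequential Python here only illustrates the data flow.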
Funder
National Natural Science Foundation of China
Subject
General Engineering, General Mathematics
Cited by
4 articles.
1. Efficient Algorithm Design of Optimizing SpMV on GPU;Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing;2023-08-07
2. Mixed and Multi-Precision SpMV for GPUs with Row-wise Precision Selection;2022 IEEE 34th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD);2022-11
3. Adaptive diagonal sparse matrix-vector multiplication on GPU;Journal of Parallel and Distributed Computing;2021-11
4. Efficient CSR-Based Sparse Matrix-Vector Multiplication on GPU;Mathematical Problems in Engineering;2016