Sparsity: Optimization Framework for Sparse Matrix Kernels

Author:

Im Eun-Jin (1), Yelick Katherine, Vuduc Richard (2)

Affiliation:

1. School of Computer Science, Kookmin University, Seoul, Korea

2. Computer Science Division, University of California, Berkeley, CA, USA

Abstract

Sparse matrix–vector multiplication is an important computational kernel that performs poorly on most modern processors due to a low compute-to-memory ratio and irregular memory access patterns. Optimization is difficult because of the complexity of cache-based memory systems and because performance is highly dependent on the non-zero structure of the matrix. The SPARSITY system is designed to address these problems by allowing users to automatically build sparse matrix kernels that are tuned to their matrices and machines. SPARSITY combines traditional techniques such as loop transformations with data structure transformations and optimization heuristics that are specific to sparse matrices. It provides a novel framework for selecting optimization parameters, such as block size, using a combination of performance models and search. In this paper we discuss the optimization of two operations: a sparse matrix times a dense vector and a sparse matrix times a set of dense vectors. Our experience indicates that register level optimizations are effective for matrices arising in certain scientific simulations, in particular finite-element problems. Cache level optimizations are important when the vector used in multiplication is larger than the cache size, especially for matrices in which the non-zero structure is random. For applications involving multiple vectors, reorganizing the computation to perform the entire set of multiplications as a single operation produces significant speedups. We describe the different optimizations and parameter selection techniques and evaluate them on several machines using over 40 matrices taken from a broad set of application domains. Our results demonstrate speedups of up to 4× for the single vector case and up to 10× for the multiple vector case.
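The three kernel variants named in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not SPARSITY's generated code: the function names, the fixed 2×2 block size, and the list-based storage layouts are all hypothetical choices made for readability. The blocked example also stores explicit zeros to fill out a block, which is an assumption about how a fixed block size would be applied to this small matrix.

```python
# Illustrative sketch only: names and layouts are assumptions,
# not SPARSITY's actual interface or generated code.

def spmv_csr(row_ptr, col_idx, vals, x):
    """y = A*x with A in compressed sparse row (CSR) format."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += vals[k] * x[col_idx[k]]  # indirect, irregular access into x
        y[i] = acc
    return y


def spmv_bcsr_2x2(brow_ptr, bcol_idx, bvals, x):
    """y = A*x with A stored as dense 2x2 blocks (row-major inside a block)."""
    n_brows = len(brow_ptr) - 1
    y = [0.0] * (2 * n_brows)
    for bi in range(n_brows):
        y0 = y1 = 0.0  # destination entries kept in locals ("registers")
        for k in range(brow_ptr[bi], brow_ptr[bi + 1]):
            j = 2 * bcol_idx[k]
            x0, x1 = x[j], x[j + 1]  # each source entry loaded once per block
            b = 4 * k
            y0 += bvals[b] * x0 + bvals[b + 1] * x1
            y1 += bvals[b + 2] * x0 + bvals[b + 3] * x1
        y[2 * bi], y[2 * bi + 1] = y0, y1
    return y


def spmm_csr(row_ptr, col_idx, vals, X):
    """Y = A*X for a set of dense vectors; X[j] holds entry j of every vector."""
    n_vecs = len(X[0])
    Y = [[0.0] * n_vecs for _ in range(len(row_ptr) - 1)]
    for i in range(len(Y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            a, xj = vals[k], X[col_idx[k]]
            for v in range(n_vecs):  # one matrix entry reused for all vectors
                Y[i][v] += a * xj[v]
    return Y


# Example matrix (hypothetical data):
#   [1 2 0 0]
#   [3 4 0 0]
#   [9 0 5 6]
#   [0 0 7 8]
row_ptr = [0, 2, 4, 7, 9]
col_idx = [0, 1, 0, 1, 0, 2, 3, 2, 3]
vals = [1, 2, 3, 4, 9, 5, 6, 7, 8]

# Same matrix as 2x2 blocks; the block holding the 9 carries three explicit zeros.
brow_ptr = [0, 1, 3]
bcol_idx = [0, 0, 1]
bvals = [1, 2, 3, 4, 9, 0, 0, 0, 5, 6, 7, 8]

x = [1, 2, 3, 4]
y_csr = spmv_csr(row_ptr, col_idx, vals, x)
y_bcsr = spmv_bcsr_2x2(brow_ptr, bcol_idx, bvals, x)

# Two vectors stored side by side: [1, 2, 3, 4] and [0, 1, 0, 1].
X = [[1, 0], [2, 1], [3, 0], [4, 1]]
Y = spmm_csr(row_ptr, col_idx, vals, X)
```

In the blocked variant each 2×2 block needs one column index instead of four, and the source and destination entries are reused from local variables, at the cost of multiplying by any stored zeros. In the multiple-vector variant each matrix entry is read once and applied to every vector, which is the reorganization the abstract credits for the larger speedups.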

Publisher

SAGE Publications

Subject

Hardware and Architecture, Theoretical Computer Science, Software

Cited by 189 articles.

1. SpChar: Characterizing the sparse puzzle via decision trees; Journal of Parallel and Distributed Computing; 2024-10

2. Dedicated Hardware Accelerators for Processing of Sparse Matrices and Vectors: A Survey; ACM Transactions on Architecture and Code Optimization; 2024-02-15

3. Sparse Matrix-Vector Product for the bmSparse Matrix Format in GPUs; Lecture Notes in Computer Science; 2024

4. SPC5: An efficient SpMV framework vectorized using ARM SVE and x86 AVX-512; Computer Science and Information Systems; 2024

5. Efficiently Running SpMV on Multi-Core DSPs for Block Sparse Matrix; 2023 IEEE 29th International Conference on Parallel and Distributed Systems (ICPADS); 2023-12-17
