HPS Cholesky: Hierarchical Parallelized Supernodal Cholesky with Adaptive Parameters


Lin Shengle¹, Yang Wangdong¹, Hu Yikun¹, Cai Qinyun¹, Dai Minlu¹, Wang Haotian¹, Li Kenli¹


1. College of Computer Science and Electronic Engineering, Hunan University, China


Sparse supernodal Cholesky factorization on multi-NUMA systems is challenging because of supernode relaxation and load balancing. In this work, we propose a novel approach that improves the performance of sparse Cholesky by combining a deep-learning-guided relaxation parameter with a hierarchical parallelization strategy with NUMA affinity. Specifically, our relaxed supernodal algorithm uses a well-trained graph convolutional network (GCN) model to adaptively adjust the relaxation parameters based on the sparse matrix's structure, achieving a proper balance between task-level parallelism and dense computational granularity. Additionally, the hierarchical parallelization maps supernodal tasks to the local NUMA parallel queue and updates contribution blocks in pipeline mode. Furthermore, stream scheduling with NUMA affinity further enhances the efficiency of memory access during numerical factorization. The experimental results show that HPS Cholesky outperforms state-of-the-art libraries such as Eigen LLT, CHOLMOD, PaStiX and SuiteSparse on 79.78%, 79.60%, 82.09% and 74.47% of 1128 datasets, respectively. It achieves an average speedup of 1.41x over the current optimal relaxation algorithm. Moreover, it surpasses MKL sparse Cholesky on 70.83% of matrices on Xeon Gold 6248.
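To make the trade-off behind the relaxation parameter concrete, the following is a minimal sketch and not the HPS Cholesky algorithm itself. It assumes a simplified greedy rule that merges consecutive columns of the factor L into one supernode while the fraction of explicitly stored zeros stays below a threshold; practical relaxed supernode schemes typically amalgamate along the elimination tree instead. The names `relaxed_supernodes`, `col_struct`, and `max_zero_frac` are hypothetical, and the threshold is a plain argument here rather than a value predicted by a GCN.

```python
def relaxed_supernodes(col_struct, max_zero_frac):
    """Greedily group consecutive columns of L into relaxed supernodes.

    col_struct[j] is the set of row indices (all >= j) holding nonzeros in
    column j of L. A column joins the current supernode if the dense
    trapezoid storing the merged columns would contain at most
    max_zero_frac explicit zeros (as a fraction of stored entries).
    Returns a list of (first_column, last_column) pairs.
    """
    supernodes = []
    start = 0
    rows = set(col_struct[0])   # union of row indices in the current supernode
    nnz = len(col_struct[0])    # true nonzeros in the current supernode
    for j in range(1, len(col_struct)):
        merged_rows = rows | col_struct[j]
        # Column k of the supernode stores every merged row index >= k.
        stored = sum(
            sum(1 for r in merged_rows if r >= k) for k in range(start, j + 1)
        )
        true_nnz = nnz + len(col_struct[j])
        zero_frac = (stored - true_nnz) / stored
        if zero_frac <= max_zero_frac:
            rows, nnz = merged_rows, true_nnz          # accept the merge
        else:
            supernodes.append((start, j - 1))          # close current supernode
            start, rows, nnz = j, set(col_struct[j]), len(col_struct[j])
    supernodes.append((start, len(col_struct) - 1))
    return supernodes


# Hypothetical 4x4 factor structure used only for illustration.
cols = [{0, 1, 3}, {1, 3}, {2, 3}, {3}]
print(relaxed_supernodes(cols, 0.0))   # strict: no explicit zeros allowed
print(relaxed_supernodes(cols, 0.25))  # relaxed: up to 25% explicit zeros
```

With a threshold of zero the columns split into more, smaller supernodes, which leaves more independent tasks; a larger threshold yields fewer, wider supernodes with larger dense blocks at the cost of storing and computing on some zeros.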


Association for Computing Machinery (ACM)


Computational Theory and Mathematics, Computer Science Applications, Hardware and Architecture, Modeling and Simulation, Software








