Affiliation:
1. College of Computer Science and Electronic Engineering, Hunan University, China
Abstract
Sparse supernodal Cholesky factorization on multi-NUMA systems is challenging because of supernode relaxation and load balancing. In this work, we propose a novel approach that improves the performance of sparse Cholesky by combining a deep-learning-based choice of relaxation parameter with a hierarchical, NUMA-affine parallelization strategy. Specifically, our relaxed supernodal algorithm uses a well-trained GCN model to adjust the relaxation parameters adaptively according to the structure of the sparse matrix, striking a balance between task-level parallelism and dense computational granularity. The hierarchical parallelization maps supernodal tasks to the local NUMA parallel queue and updates contribution blocks in pipeline mode. Stream scheduling with NUMA affinity further improves memory-access efficiency during numerical factorization. The experimental results show that HPS Cholesky outperforms state-of-the-art libraries such as Eigen \(LL^T\), CHOLMOD, PaStiX and SuiteSparse on 79.78%, 79.60%, 82.09% and 74.47% of 1128 datasets, respectively. It achieves an average speedup of 1.41x over the current optimal relaxation algorithm. Moreover, it surpasses MKL sparse Cholesky on 70.83% of the matrices on a Xeon Gold 6248.
Publisher
Association for Computing Machinery (ACM)
Subject
Computational Theory and Mathematics, Computer Science Applications, Hardware and Architecture, Modeling and Simulation, Software