Affiliation:
1. The Chinese University of Hong Kong, Shenzhen, Shenzhen, Guangdong, China
2. HUAWEI CLOUD, Shenzhen, Guangdong, China
Abstract
Sparse matrices are often used to model the interactions among different objects, and they are prevalent in many areas including e-commerce, social networks, and biology. As one of the fundamental matrix operations, sparse matrix chain multiplication (SMCM) aims to efficiently multiply a chain of sparse matrices, and it has found various real-world applications in areas such as network analysis, data mining, and machine learning. The efficiency of SMCM largely hinges on the order in which the matrices are multiplied, which in turn relies on accurately estimating the sparsity of the intermediate matrices. Existing matrix sparsity estimators often struggle with large sparse matrices because their accuracy is limited in both theory and practice. To enable efficient SMCM, in this paper we introduce a novel row-wise sparsity estimator (RS-estimator), a straightforward yet effective estimator that leverages matrix structural properties to achieve efficient, accurate, and theoretically guaranteed sparsity estimation. Based on the RS-estimator, we propose a novel ordering algorithm for determining a good multiplication order for efficient SMCM. We further develop an efficient parallel SMCM algorithm that effectively utilizes multiple CPU threads. We have conducted experiments by multiplying various chains of large sparse matrices extracted from five real-world large graph datasets, and the results demonstrate the effectiveness and efficiency of our proposed methods. In particular, our SMCM algorithm is up to three orders of magnitude faster than the state-of-the-art algorithms.
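To illustrate why the multiplication order depends on sparsity estimates, the following is a minimal Python sketch of a dynamic-programming chain planner. It is not the RS-estimator or the ordering algorithm of the paper: the function names (est_density, plan_chain) are hypothetical, and the estimator is a naive one that assumes nonzeros are uniformly and independently distributed, which is the kind of assumption that tends to break down on real graph data.

```python
from functools import lru_cache


def est_density(d_a, d_b, inner):
    """Naive density estimate for A @ B (assumption: nonzeros are uniform
    and independent). An output cell is nonzero unless all `inner`
    candidate products are zero."""
    return 1.0 - (1.0 - d_a * d_b) ** inner


def plan_chain(dims, densities):
    """Pick a parenthesization for a chain of sparse matrices.

    dims:      n + 1 integers; matrix i has shape dims[i] x dims[i + 1].
    densities: n floats in [0, 1], the fraction of nonzeros per matrix.
    Returns (estimated scalar multiplications, plan string).
    """
    n = len(densities)

    @lru_cache(maxsize=None)
    def best(i, j):
        # Returns (cost, estimated density, plan) for matrices i..j.
        if i == j:
            return 0.0, densities[i], f"M{i}"
        result = None
        for k in range(i, j):
            c_l, d_l, p_l = best(i, k)
            c_r, d_r, p_r = best(k + 1, j)
            rows, inner, cols = dims[i], dims[k + 1], dims[j + 1]
            # Expected number of nonzero-by-nonzero scalar products.
            mult = rows * inner * cols * d_l * d_r
            cost = c_l + c_r + mult
            if result is None or cost < result[0]:
                # The density kept here comes from the cheapest split;
                # the naive estimate can differ between splits.
                result = (cost, est_density(d_l, d_r, inner),
                          f"({p_l} {p_r})")
        return result

    cost, _, plan = best(0, n - 1)
    return cost, plan


if __name__ == "__main__":
    # Three matrices of shapes 1000x10, 10x1000, 1000x10, each 10% dense.
    print(plan_chain([1000, 10, 1000, 10], [0.1, 0.1, 0.1]))
```

Because every cost in the plan is derived from an estimated density of an intermediate product, an inaccurate estimator can steer the planner toward a much more expensive order, which is the dependency the abstract describes.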
Funder
NSFC
Basic and Applied Basic Research Fund in Guangdong Province
Guangdong Talent Program
Publisher
Association for Computing Machinery (ACM)