A high-performance parallel algorithm for nonnegative matrix factorization

Published: 2016-11-09 Issue: 8 Volume: 51 Page: 1-11
ISSN: 0362-1340
Container-title: ACM SIGPLAN Notices
Language: en
Short-container-title: SIGPLAN Not.

Author:

Kannan Ramakrishnan¹, Ballard Grey², Park Haesun¹

Affiliation:

1. Georgia Tech

2. Sandia National Laboratories

Abstract

Non-negative matrix factorization (NMF) is the problem of determining two non-negative low rank factors W and H, for the given input matrix A, such that A ≈ WH. NMF is a useful tool for many applications in different domains such as topic modeling in text mining, background separation in video analysis, and community detection in social networks. Despite its popularity in the data mining community, there is a lack of efficient distributed algorithms to solve the problem for big data sets. We propose a high-performance distributed-memory parallel algorithm that computes the factorization by iteratively solving alternating non-negative least squares (NLS) subproblems for W and H. It maintains the data and factor matrices in memory (distributed across processors), uses MPI for interprocessor communication, and, in the dense case, provably minimizes communication costs (under mild assumptions). As opposed to previous implementations, our algorithm is also flexible: (1) it performs well for both dense and sparse matrices, and (2) it allows the user to choose any one of the multiple algorithms for solving the updates to low rank factors W and H within the alternating iterations. We demonstrate the scalability of our algorithm and compare it with baseline implementations, showing significant performance improvements.
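
The alternating scheme described in the abstract can be illustrated with a small single-process sketch. This is not the authors' distributed MPI implementation: it only shows the idea of fixing one factor and solving a non-negative least squares (NLS) subproblem for the other. The function names `nls_columns` and `anls_nmf`, the random initialization, the iteration count, and the choice of SciPy's `nnls` as the NLS solver are all assumptions made for illustration; the paper allows any of several NLS solvers in this role.

```python
import numpy as np
from scipy.optimize import nnls


def nls_columns(C, B):
    """Solve min_{X >= 0} ||C X - B||_F one column of B at a time."""
    X = np.zeros((C.shape[1], B.shape[1]))
    for j in range(B.shape[1]):
        X[:, j], _ = nnls(C, B[:, j])
    return X


def anls_nmf(A, k, iters=20, seed=0):
    """Sequential alternating-NLS sketch of A ≈ W H with W, H >= 0 (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = rng.random((m, k))  # random non-negative start (an assumption)
    H = rng.random((k, n))
    for _ in range(iters):
        # Fix W and update H: min_{H >= 0} ||W H - A||_F
        H = nls_columns(W, A)
        # Fix H and update W: min_{W >= 0} ||H^T W^T - A^T||_F
        W = nls_columns(H.T, A.T).T
    return W, H


# Small synthetic input: a non-negative matrix of rank at most 4.
rng = np.random.default_rng(1)
A = rng.random((30, 4)) @ rng.random((4, 20))
W, H = anls_nmf(A, k=4)
rel_err = np.linalg.norm(A - W @ H) / np.linalg.norm(A)
print(f"relative error: {rel_err:.4f}")
```

Because each subproblem is solved exactly, the residual cannot increase from one iteration to the next. In a distributed setting, A, W, and H would be partitioned across processors and the products feeding each NLS solve would require communication, which is the part the paper's algorithm is designed to minimize.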

Funder

U.S. Department of Energy

Air Force Office of Scientific Research

Defense Advanced Research Projects Agency

National Science Foundation

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Graphics and Computer-Aided Design, Software

Link

https://dl.acm.org/doi/pdf/10.1145/3016078.2851152

References (30 articles)

1. Hypergraph Partitioning for Parallel Sparse Matrix-Matrix Multiplication

2. Nonnegative Matrix and Tensor Factorizations

3. Communication-Optimal Parallel Recursive Rectangular Matrix Multiplication

4. Behavioral clusters in dynamic graphs

Cited by 30 articles.

1. Mentor: A Memory-Efficient Sparse-dense Matrix Multiplication Accelerator Based on Column-Wise Product;ACM Transactions on Architecture and Code Optimization;2024-08-26

2. DTC-SpMM: Bridging the Gap in Accelerating General Sparse Matrix Multiplication with Tensor Cores;Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3;2024-04-27

3. The cycling and aging mouse female reproductive tract at single-cell resolution;Cell;2024-02

4. Overlapping community detection using expansion with contraction;Neurocomputing;2024-01

5. NMTF-LTM: Towards an Alignment of Semantics for Lifelong Topic Modeling;IEEE Transactions on Knowledge and Data Engineering;2023-10-01