Affiliation:
1. University of California, Berkeley and Sandia National Laboratories
2. University of California, Berkeley
Abstract
The running time of an algorithm depends on both arithmetic and communication (i.e., data movement) costs, and the relative costs of communication are growing over time. In this work, we present sequential and distributed-memory parallel algorithms for tridiagonalizing full symmetric and symmetric band matrices that asymptotically reduce communication compared to previous approaches.
The tridiagonalization of a symmetric band matrix is a key kernel in solving the symmetric eigenvalue problem for both full and band matrices. In order to preserve structure, tridiagonalization routines use annihilate-and-chase procedures that previously have suffered from poor data locality and high parallel latency cost. We improve both by reorganizing the computation and obtain asymptotic improvements. We also propose new algorithms for reducing a full symmetric matrix to band form in a communication-efficient manner. In this article, we consider the cases of computing eigenvalues only and of computing eigenvalues and all eigenvectors.
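To make the annihilate-and-chase idea concrete, the following is a minimal sketch of the classical approach: one Givens rotation zeroes a band entry, and the fill-in ("bulge") it creates just outside the band is chased down the matrix by further rotations. It is not the communication-efficient algorithm this article proposes. Dense storage is used for readability, it computes no eigenvectors, and the names `rotate` and `tridiagonalize_band` are hypothetical.

```python
import math


def rotate(A, p, q, c, s):
    """Apply the similarity transform G A G^T in place, G a rotation in plane (p, q)."""
    n = len(A)
    for j in range(n):  # mix rows p and q
        ap, aq = A[p][j], A[q][j]
        A[p][j] = c * ap + s * aq
        A[q][j] = -s * ap + c * aq
    for i in range(n):  # mix columns p and q (keeps A symmetric)
        ap, aq = A[i][p], A[i][q]
        A[i][p] = c * ap + s * aq
        A[i][q] = -s * ap + c * aq


def tridiagonalize_band(A, b):
    """Reduce a symmetric matrix with half-bandwidth b to tridiagonal form.

    Assumes A[i][j] == 0 whenever |i - j| > b. Returns a new matrix and
    leaves the input untouched.
    """
    n = len(A)
    A = [row[:] for row in A]
    for k in range(n - 2):
        # Annihilate column k from the edge of the band up to the subdiagonal.
        for i in range(min(k + b, n - 1), k + 1, -1):
            r, col = i, k
            while r < n:
                if A[r][col] == 0.0:
                    break  # nothing to annihilate, so no bulge is created
                x, y = A[r - 1][col], A[r][col]
                h = math.hypot(x, y)
                rotate(A, r - 1, r, x / h, y / h)
                A[r][col] = A[col][r] = 0.0  # clear rounding residue
                # The rotation in plane (r-1, r) creates a bulge at
                # (r + b, r - 1); chase it on the next pass of the loop.
                r, col = r + b, r - 1
    return A
```

Each rotation is an orthogonal similarity transform, so the eigenvalues of the returned tridiagonal matrix match those of the input. The chase touches a different small window of the matrix at every step, which is the source of the poor data locality mentioned above.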
Funder
Center for Future Architecture Research
Lockheed Martin Corporation
Sandia National Laboratories
US DOE
U.S. Department of Energy Contract
Microsoft
ParLab
DARPA
MathWorks
NSF
Intel
STARnet
National Instruments
Sandia National Laboratories Truman Fellowship in National Security Science and Engineering
Samsung
UC Discovery
Nokia
NVIDIA
Sandia Corporation
Oracle
Semiconductor Research Corporation
MARCO
Publisher
Association for Computing Machinery (ACM)
Subject
Computational Theory and Mathematics, Computer Science Applications, Hardware and Architecture, Modeling and Simulation, Software
Cited by
9 articles.
1. Generalized Ware-Amdhal Law;2024 32nd Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP);2024-03-20
2. Efficient parallel reduction of bandwidth for symmetric matrices;Parallel Computing;2023-02
3. Algorithm and Software Overhead: A Theoretical Approach to Performance Portability;Parallel Processing and Applied Mathematics;2023
4. High-performance sampling of generic determinantal point processes;Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences;2020-01-20
5. Improved Unconstrained Energy Functional Method for Eigensolvers in Electronic Structure Calculations;Proceedings of the 48th International Conference on Parallel Processing;2019-08-05