Tridigpu: A GPU Library for Block Tridiagonal and Banded Linear Equation Systems

Published:2023-03-29 Issue:1 Volume:10 Page:1-33
ISSN:2329-4949
Container-title:ACM Transactions on Parallel Computing
language:en
Short-container-title:ACM Trans. Parallel Comput.

Author:

Klein Christoph¹, Strzodka Robert¹

Affiliation:

1. Institute of Computer Engineering (ZITI), Heidelberg, Germany

Abstract

In this article, we present a CUDA library with a C API for solving block cyclic tridiagonal and banded systems on one GPU. The library can process block tridiagonal systems with block sizes from 1 × 1 (scalar) to 4 × 4 and banded systems with up to four sub- and superdiagonals. For the compute-intensive block size cases and cases with many right-hand sides, we write out an explicit factorization to memory; however, for the scalar case, the fastest approach is to only output the coarse system and recompute the factorization. Prominent features of the library are (scaled) partial pivoting for improved numeric stability; highest-performance kernels, which completely utilize GPU memory bandwidth; and support for multiple sparse or dense right-hand side and solution vectors. The additional memory consumption is only 5% of the original tridiagonal system, which enables the solution of systems up to GPU memory size. The library outperforms the state-of-the-art scalar tridiagonal solver of cuSPARSE by a factor of 5 for large problem sizes of 2^25 unknowns on a GeForce RTX 2080 Ti.

Publisher

Association for Computing Machinery (ACM)

Subject

Computational Theory and Mathematics, Computer Science Applications, Hardware and Architecture, Modeling and Simulation, Software

Link

https://dl.acm.org/doi/pdf/10.1145/3580373

References: 41 articles.

1. Fast k-selection algorithms for graphics processing units

2. A Study on the Implementation of Tridiagonal Systems Solvers Using a GPU

3. Li-Wen Chang. 2014. Scalable Parallel Tridiagonal Algorithms with Diagonal Pivoting and Their Optimization for Many-Core Architectures. Master’s Thesis. University of Illinois at Urbana-Champaign.

4. A scalable, numerically stable, high-performance tridiagonal solver using GPUs