Adaptively restarted block Krylov subspace methods with low-synchronization skeletons

Published: 2022-12-28
ISSN: 1017-1398
Container-title: Numerical Algorithms
Language: en
Short-container-title: Numer Algor

Author:

Lund, Kathryn

Abstract

With the recent realization of exascale performance by Oak Ridge National Laboratory’s Frontier supercomputer, reducing communication in kernels like QR factorization has become even more imperative. Low-synchronization Gram-Schmidt methods, first introduced in Świrydowicz et al. (Numer. Lin. Alg. Appl. 28(2):e2343, 2020), have been shown to improve the scalability of the Arnoldi method in high-performance distributed computing. Block versions of low-synchronization Gram-Schmidt show further potential for speeding up algorithms, as column-batching allows for maximizing cache usage with matrix-matrix operations. In this work, low-synchronization block Gram-Schmidt variants from Carson et al. (Linear Algebra Appl. 638:150–195, 2022) are transformed into block Arnoldi variants for use in block full orthogonalization methods (BFOM) and block generalized minimal residual methods (BGMRES). An adaptive restarting heuristic is developed to handle instabilities that arise with the increasing condition number of the Krylov basis. The performance, accuracy, and stability of these methods are assessed via a flexible benchmarking tool written in MATLAB. The modularity of the tool additionally permits generalized block inner products, like the global inner product.

Funder

Max Planck Institute for Dynamics of Complex Technical Systems (MPI Magdeburg)

Publisher

Springer Science and Business Media LLC

Subject

Applied Mathematics

Link

https://link.springer.com/content/pdf/10.1007/s11075-022-01437-1.pdf

References (42 articles)

1. Barlow, J.L.: Block modified Gram-Schmidt algorithms and their analysis. SIAM J. Matrix Anal. Appl. 40(4), 1257–1290 (2019). https://doi.org/10.1137/18M1197400

2. Świrydowicz, K., Langou, J., Ananthan, S., Yang, U., Thomas, S.: Low synchronization Gram-Schmidt and generalized minimum residual algorithms. Numer. Lin. Alg. Appl. 28(2), e2343 (2020). https://doi.org/10.1002/nla.2343

3. Yamazaki, I., Thomas, S., Hoemmen, M., Boman, E.G., Świrydowicz, K., Elliott, J.J.: Low-synchronization orthogonalization schemes for s-step and pipelined Krylov solvers in Trilinos. In: Proceedings of the 2020 SIAM Conference on Parallel Processing for Scientific Computing (PP), pp. 118–128 (2020). https://doi.org/10.1137/1.9781611976137.11

4. Thomas, S., Carson, E., Rozložník, M., Carr, A., Świrydowicz, K.: Iterated Gauss-Seidel GMRES. arXiv:2205.07805v2 (2022)

5. Bielich, D., Langou, J., Thomas, S., Świrydowicz, K., Yamazaki, I., Boman, E.G.: Low-synch Gram–Schmidt with delayed reorthogonalization for Krylov solvers. Parallel Comput. 112, 102940 (2022). https://doi.org/10.1016/j.parco.2022.102940

Cited by 3 articles.

1. Efficient GMRES+AMG on GPUs: Composite Smoothers and Mixed V-Cycles;SIAM Journal on Scientific Computing;2024-09-03

2. A robust implicit high-order discontinuous Galerkin method for solving compressible Navier-Stokes equations on arbitrary grids;Acta Mechanica Sinica;2024-06-11

3. Matrix Pencil Optimal Iterative Algorithms and Restarted Versions for Linear Matrix Equation and Pseudoinverse;Mathematics;2024-06-05