Linear Systems Solvers for Distributed-Memory Machines with GPU Accelerators

Published: 2019 Pages: 495-506
ISSN: 0302-9743
Container-title: Lecture Notes in Computer Science

Authors:

Kurzak Jakub, Gates Mark, Charara Ali, YarKhan Asim, Yamazaki Ichitaro, Dongarra Jack

Publisher

Springer International Publishing

Link

http://link.springer.com/content/pdf/10.1007/978-3-030-29400-7_35

References: 15 articles.

1. Andersen, B.S., Gunnels, J.A., Gustavson, F., Wasniewski, J.: A recursive formulation of the inversion of symmetric positive definite matrices in packed storage data format. PARA 2, 287–296 (2002)

2. Andersen, B.S., Waśniewski, J., Gustavson, F.G.: A recursive formulation of Cholesky factorization of a matrix in packed storage. ACM Trans. Math. Softw. (TOMS) 27(2), 214–244 (2001)

3. Blackford, L.S., et al.: ScaLAPACK Users’ Guide. SIAM, Philadelphia (1997)

4. Castaldo, A., Whaley, C.: Scaling LAPACK panel operations using parallel cache assignment. In: ACM Sigplan Notices, vol. 45, pp. 223–232. ACM (2010)

5. Chan, E., van de Geijn, R., Chapman, A.: Managing the complexity of lookahead for LU factorization with pivoting. In: Proceedings of the Twenty-second Annual ACM Symposium on Parallelism in Algorithms and Architectures, pp. 200–208. ACM (2010)

Cited by 5 articles.

1. Optimizing High-Performance Linpack for Exascale Accelerated Architectures;Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis;2023-11-11

2. Extending Hedgehog’s Dataflow Graphs to Multi-node GPU Architectures;Asynchronous Many-Task Systems and Applications;2023

3. Threshold Pivoting for Dense LU Factorization;2022 IEEE/ACM Workshop on Latest Advances in Scalable Algorithms for Large-Scale Heterogeneous Systems (ScalAH);2022-11

4. Optimizing the LINPACK Algorithm for Large-Scale PCIe-Based CPU-GPU Heterogeneous Systems;IEEE Transactions on Parallel and Distributed Systems;2021-09-01

5. Generic Matrix Multiplication for Multi-GPU Accelerated Distributed-Memory Platforms over PaRSEC;2019 IEEE/ACM 10th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA);2019-11