Author:
Abdelfattah Ahmad, Tomov Stan, Dongarra Jack
Publisher
Springer International Publishing
Reference
21 articles.
1. LAPACK - Linear Algebra PACKage. http://www.netlib.org/lapack/
2. Abdelfattah, A., Haidar, A., Tomov, S., Dongarra, J.: Performance, design, and autotuning of batched GEMM for GPUs. In: ISC High Performance 2016, Frankfurt, Germany, 19–23 June 2016, Proceedings, pp. 21–38 (2016)
3. Abdelfattah, A., Haidar, A., Tomov, S., Dongarra, J.: Factorization and inversion of a million matrices using GPUs: challenges and countermeasures. Procedia Comput. Sci. 108, 606–615 (2017). ICCS 2017, Zurich, Switzerland
4. Abdelfattah, A., Haidar, A., Tomov, S., Dongarra, J.J.: Batched one-sided factorizations of tiny matrices using GPUs: challenges and countermeasures. J. Comput. Sci. 26, 226–236 (2018)
5. hipBLAS. https://github.com/ROCmSoftwarePlatform/hipBLAS
Cited by
6 articles.
1. MAGMA: Enabling exascale performance with accelerated BLAS and LAPACK for diverse GPU architectures;The International Journal of High Performance Computing Applications;2024-06-20
2. GPU-based LU Factorization and Solve on Batches of Matrices with Band Structure;Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis;2023-11-12
3. Analyzing the Implementation of the Newton Raphson Based Power Flow Formulation in CPU+GPU Computing Environment;2023 North American Power Symposium (NAPS);2023-10-15
4. eGPU: A 750 MHz Class Soft GPGPU for FPGA;2023 33rd International Conference on Field-Programmable Logic and Applications (FPL);2023-09-04
5. An Optimized Framework for Matrix Factorization on the New Sunway Many-core Platform;ACM Transactions on Architecture and Code Optimization;2023-03