Funder
National Science Foundation Directorate for Computer and Information Science and Engineering
Subject
Artificial Intelligence, Computer Networks and Communications, Hardware and Architecture, Theoretical Computer Science, Software
Reference
38 articles.
1. KBLAS: An optimized library for dense matrix-vector multiplication on GPU accelerators;Abdelfattah;ACM Trans. Math. Softw. (TOMS),2016
2. Basic Linear Algebra on NVIDIA GPUs, https://developer.nvidia.com/cublas.
3. Fault Tolerant and Energy Efficient One-Sided Matrix Decompositions on Heterogeneous Systems with GPUs;Chen,2019
4. J. Chen, S. Li, Z. Chen, GPU-ABFT: Optimizing algorithm-based fault tolerance for heterogeneous systems with GPUs, in: 2016 IEEE International Conference on Networking, Architecture and Storage (NAS).
5. Fault tolerant one-sided matrix decompositions on heterogeneous systems with GPUs;Chen,2018
Cited by
15 articles.
1. Detailed Analysis and Optimization of Irregular-Shaped Matrix Multiplication on Multi-Core DSPs;Proceedings of the 53rd International Conference on Parallel Processing;2024-08-12
2. Optimizing General Matrix Multiplications on Modern Multi-core DSPs;2024 IEEE International Parallel and Distributed Processing Symposium (IPDPS);2024-05-27
3. FlexGEMM: A Flexible Micro-kernel Generation Framework;Proceedings of the 5th International Conference on Computer Information and Big Data Applications;2024-04-26
4. Optimizing Full-Spectrum Matrix Multiplications on ARMv8 Multi-Core CPUs;IEEE Transactions on Parallel and Distributed Systems;2024-03
5. Fast Kronecker Matrix-Matrix Multiplication on GPUs;Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming;2024-02-20