Authors:
Heinecke Alexander, Vaidyanathan Karthikeyan, Smelyanskiy Mikhail, Kobotov Alexander, Dubtsov Roman, Henry Greg, Shet Aniruddha G., Chrysos George, Dubey Pradeep
Cited by 85 articles.
1. A novel HPL-AI approach for FP16-only accelerator and its instantiation on Kunpeng+Ascend AI-specific platform;Journal of Parallel and Distributed Computing;2024-08
2. Optimizing General Matrix Multiplications on Modern Multi-core DSPs;2024 IEEE International Parallel and Distributed Processing Symposium (IPDPS);2024-05-27
3. 5 ExaFlop/s HPL-MxP Benchmark with Linear Scalability on the 40-Million-Core Sunway Supercomputer;Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis;2023-11-11
4. Evolving the HPL benchmark towards multi-GPGPU clusters;CCF Transactions on High Performance Computing;2022-10-26
5. Seamless optimization of the GEMM kernel for task-based programming models;Proceedings of the 36th ACM International Conference on Supercomputing;2022-06-28