Author:
Mukunoki Daichi, Hirota Yusuke, Imamura Toshiyuki
Funder
Japan Society for the Promotion of Science (JSPS) KAKENHI
University of Tokyo
Cited by
1 article.
1. Task-aware Scheduling and Performance Optimization on Yitian710 SoC for GEMM-based Workloads on the Cloud;2023 IEEE 5th International Conference on Artificial Intelligence Circuits and Systems (AICAS);2023-06-11