Authors:
Zhou Keren, Meng Xiaozhu, Sai Ryuichi, Mellor-Crummey John
Cited by
12 articles.
1. Refining HPCToolkit for application performance analysis at exascale;The International Journal of High Performance Computing Applications;2024-08-30
2. Low-Overhead Trace Collection and Profiling on GPU Compute Kernels;ACM Transactions on Parallel Computing;2024-06-08
3. FASTEN: Fast GPU-accelerated Segmented Matrix Multiplication for Heterogenous Graph Neural Networks;Proceedings of the 38th ACM International Conference on Supercomputing;2024-05-30
4. Starlight: A kernel optimizer for GPU processing;Journal of Parallel and Distributed Computing;2024-05
5. GPUscout: Locating Data Movement-related Bottlenecks on GPUs;Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis;2023-11-12