Affiliation:
1. University of California
2. Pacific Northwest National Laboratory
Funder
NSF (National Science Foundation)
U.S. DOE Office of Science, Office of Advanced Scientific Computing Research
Reference
47 articles.
1. AMD. 2013. AMD Accelerated Parallel Processing OpenCL Programming Guide. http://developer.amd.com/wordpress/media/2013/07/AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide-rev-2.7.pdf
2. Deep Learning with Low Precision by Half-Wave Gaussian Quantization
3. Xception: Deep Learning with Depthwise Separable Convolutions
4. NVIDIA A100 Tensor Core GPU: Performance and Innovation
5. Volta: Performance and Programmability
Cited by
20 articles.
1. On the Rise of AMD Matrix Cores: Performance, Power Efficiency, and Programmability;2024 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS);2024-05-05
2. ZENO: A Type-based Optimization Framework for Zero Knowledge Neural Network Inference;Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1;2024-04-17
3. Revisit and Benchmarking of Automated Quantization Toward Fair Comparison;IEEE Transactions on Computers;2024-01
4. DASP: Specific Dense Matrix Multiply-Accumulate Units Accelerated General Sparse Matrix-Vector Multiplication;Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis;2023-11-11
5. BitGNN: Unleashing the Performance Potential of Binary Graph Neural Networks on GPUs;Proceedings of the 37th International Conference on Supercomputing;2023-06-21