Efficient tensor core-based GPU kernels for structured sparsity under reduced precision

Published: 2021-11-13
Container-title: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis

Author:

Chen Zhaodong¹, Qu Zheng¹, Liu Liu¹, Ding Yufei¹, Xie Yuan¹

Affiliation:

1. University of California

Funder

National Science Foundation

Publisher

ACM

Link

https://dl.acm.org/doi/pdf/10.1145/3458817.3476182

Reference: 31 articles.

1. Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Jozefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dandelion Mané, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, Mike Schuster, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasudevan, Fernanda Viégas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2015. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. https://www.tensorflow.org/ Software available from tensorflow.org.

Cited by 22 articles.

1. Jigsaw: Accelerating SpMM with Vector Sparsity on Sparse Tensor Core;Proceedings of the 53rd International Conference on Parallel Processing;2024-08-12

2. Bitmap-Based Sparse Matrix-Vector Multiplication with Tensor Cores;Proceedings of the 53rd International Conference on Parallel Processing;2024-08-12

3. DTC-SpMM: Bridging the Gap in Accelerating General Sparse Matrix Multiplication with Tensor Cores;Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3;2024-04-27

4. EVT: Accelerating Deep Learning Training with Epilogue Visitor Tree;Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3;2024-04-27

5. Fractal: Joint Multi-Level Sparse Pattern Tuning of Accuracy and Performance for DNN Pruning;Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3;2024-04-27