Accelerating Sparse CNN Inference on GPUs with Performance-Aware Weight Pruning

Published: 2020-09-30
Container-title: Proceedings of the ACM International Conference on Parallel Architectures and Compilation Techniques

Author:

Rumi Masuma Akter¹, Ma Xiaolong², Wang Yanzhi², Jiang Peng¹

Affiliation:

1. The University of Iowa, Iowa City, IA, USA

2. Northeastern University, Boston, MA, USA

Publisher

ACM

Link

https://dl.acm.org/doi/pdf/10.1145/3410463.3414648

References: 47 articles.

1. 2007. hMETIS. http://glaros.dtc.umn.edu/gkhome/metis/hmetis/overview

2. 2016. CUDA8 Performance Overview. http://developer.download.nvidia.com/compute/cuda/compute-docs/cuda-performance-report.pdf

3. 2019. The API reference guide for cuSPARSE the CUDA sparse matrix library. https://docs.nvidia.com/cuda/cusparse/index.html Version 10.1.168.

4. 2019. cuDNN Developer Guide. https://docs.nvidia.com/deeplearning/sdk/cudnn-developer-guide/index.html.

5. 2019. MKLDNN. http://intel.github.io/mkl-dnn/

Cited by 16 articles.

1. Combining Weight Approximation, Sharing and Retraining for Neural Network Model Compression;ACM Transactions on Embedded Computing Systems;2024-09-11

2. Re-compact: Structured Pruning and SpMM Kernel Co-design for Accelerating DNNs on GPUs;2023 IEEE 41st International Conference on Computer Design (ICCD);2023-11-06

3. Estimating Redundancy-Reliability of CNNs Based on Strip-Median Attributes;IEEE Transactions on Very Large Scale Integration (VLSI) Systems;2023-10

4. Unified Data-Free Compression: Pruning and Quantization without Fine-Tuning;2023 IEEE/CVF International Conference on Computer Vision (ICCV);2023-10-01

5. cuSCNN: an Efficient CUDA Implementation of Sparse CNNs;Proceedings of the 13th International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies;2023-06-14