Cascading structured pruning

Author:

Hanson Edward¹, Li Shiyu¹, Li Hai 'Helen'¹, Chen Yiran¹

Affiliation:

1. Duke University

Funder

NSF

ARO

Publisher

ACM

References: 41 articles.

1. J. Albericio, A. Delmas, P. Judd, S. Sharify, G. O'Leary, R. Genov, and A. Moshovos, "Bit-pragmatic deep neural network computing," in Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2017, Cambridge, MA, USA, October 14--18, 2017. ACM, 2017, pp. 382--394.

2. J. Albericio, P. Judd, T. Hetherington, T. Aamodt, N. E. Jerger, and A. Moshovos, "Cnvlutin: Ineffectual-neuron-free deep neural network computing," in 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA), 2016, pp. 1--13.

3. T. Chen et al., "DianNao: A small-footprint high-throughput accelerator for ubiquitous machine-learning," in Proceedings of the 19th International Conference on Architectural Support for Programming Languages and Operating Systems, ser. ASPLOS '14. New York, NY, USA: Association for Computing Machinery, 2014.

4. R. Cheong and R. Daniel, "transformers.zip: Compressing transformers with pruning and quantization," Technical report, Stanford University, 2019.

5. C. Deng, S. Liao, Y. Xie, K. K. Parhi, X. Qian, and B. Yuan, "PermDNN: Efficient compressed DNN architecture with permuted diagonal matrices," in Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, ser. MICRO-51. IEEE Press, 2018, pp. 189--202.

Cited by 14 articles.

1. Workload-Balanced Pruning for Sparse Spiking Neural Networks;IEEE Transactions on Emerging Topics in Computational Intelligence;2024-08

2. CRPIM: An efficient compute-reuse scheme for ReRAM-based Processing-in-Memory DNN accelerators;Journal of Systems Architecture;2024-08

3. RCW-Pruner: Row-Column Wise Pruning Framework on Systolic Array;2024 10th IEEE International Conference on High Performance and Smart Computing (HPSC);2024-05-10

4. SparGNN: Efficient Joint Feature-Model Sparsity Exploitation in Graph Neural Network Acceleration;2024 29th Asia and South Pacific Design Automation Conference (ASP-DAC);2024-01-22

5. SAC: An Ultra-Efficient Spin-based Architecture for Compressed DNNs;ACM Transactions on Architecture and Code Optimization;2024-01-19
