Efficient-Grad: Efficient Training Deep Convolutional Neural Networks on Edge Devices with Gradient Optimizations

Author:

Hong Ziyang (1), Yue C. Patrick (1)

Affiliation:

1. HKUST-Qualcomm Joint Innovation and Research Laboratory, Hong Kong University of Science and Technology, Hong Kong SAR, China

Abstract

With the proliferation of mobile devices, distributed learning, which enables model training on decentralized data, has attracted great interest from researchers. However, the limited training capability of edge devices significantly restricts the energy efficiency of distributed learning in practice. This article describes Efficient-Grad, an algorithm-hardware co-design approach for training deep convolutional neural networks that improves both throughput and energy savings during model training with negligible loss in validation accuracy. Efficient-Grad exploits two observations. First, sparsity can be exploited not only in activations and weights but also in gradients, together with the asymmetry residing in the gradients of conventional back propagation (BP). Second, a dedicated hardware architecture for sparsity utilization and efficient data movement can be optimized to support the Efficient-Grad algorithm in a scalable manner. To the best of our knowledge, Efficient-Grad is the first approach to successfully adopt a feedback-alignment (FA)-based gradient optimization scheme for deep convolutional neural network training, which underlies its advantage in energy efficiency. We present case studies demonstrating that the Efficient-Grad design outperforms prior art by 3.72x in energy efficiency.
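To make the two algorithmic ideas named in the abstract more concrete, the following is a minimal sketch of generic feedback alignment and magnitude-based gradient pruning as they are commonly described in the literature. It is not the Efficient-Grad algorithm itself: the layer sizes, the fixed random `feedback` matrix, the pruning `threshold`, and all function names are assumptions chosen for illustration only.

```python
import random


def matvec(m, v):
    """Multiply matrix m (a list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def transpose(m):
    """Return the transpose of matrix m."""
    return [list(col) for col in zip(*m)]


def hidden_error_bp(w_out, delta_out):
    """Conventional BP: send the output error back through the
    transpose of the forward weights (symmetric feedback)."""
    return matvec(transpose(w_out), delta_out)


def hidden_error_fa(feedback, delta_out):
    """Feedback alignment: send the output error back through a fixed
    random matrix that is independent of the forward weights."""
    return matvec(feedback, delta_out)


def prune_small(grad, threshold):
    """Zero out entries whose magnitude is below threshold (hypothetical rule)."""
    return [g if abs(g) >= threshold else 0.0 for g in grad]


def sparsity(vec):
    """Fraction of entries that are exactly zero."""
    return sum(1 for g in vec if g == 0.0) / len(vec)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n_hidden, n_out = 8, 3

# Forward weights of the output layer, shape (n_out x n_hidden).
w_out = [[rng.gauss(0, 1) for _ in range(n_hidden)] for _ in range(n_out)]
# Fixed random feedback matrix, shape (n_hidden x n_out); never updated.
feedback = [[rng.gauss(0, 1) for _ in range(n_out)] for _ in range(n_hidden)]

delta_out = [0.5, -0.2, 0.1]  # illustrative output-layer error signal

delta_bp = hidden_error_bp(w_out, delta_out)
delta_fa = hidden_error_fa(feedback, delta_out)
delta_fa_sparse = prune_small(delta_fa, threshold=0.3)

print("BP hidden error:     ", delta_bp)
print("FA hidden error:     ", delta_fa)
print("FA sparsity (pruned):", sparsity(delta_fa_sparse))
```

In this sketch the backward pass under feedback alignment never reads the forward weights, which is the asymmetry between the forward and backward paths; zeroed gradient entries are the kind of sparsity that dedicated hardware could skip.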

Funder

Hong Kong Research Grants Council under General Research Fund

HKUST-Qualcomm Joint Innovation and Research Laboratory

Publisher

Association for Computing Machinery (ACM)

Subject

Hardware and Architecture, Software


Cited by 4 articles.

1. Evaluating the energy impact of device parameters for DNN inference on edge;Proceedings of the 14th International Green and Sustainable Computing Conference;2023-10-28

2. Cross-Dimensional Refined Learning for Real-Time 3D Visual Perception from Monocular Video;2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW);2023-10-02

3. Efficient On-device Transfer Learning using Activation Memory Reduction;2023 Eighth International Conference on Fog and Mobile Edge Computing (FMEC);2023-09-18

4. Efficient On-Device Training via Gradient Filtering;2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR);2023-06
