SAC: An Ultra-Efficient Spin-based Architecture for Compressed DNNs

Author:

Zhao Yunping¹, Ma Sheng², Liu Heng³, Huang Libo³, Dai Yi³

Affiliation:

1. Institute of Microelectronics and Microprocessors, School of Computer, National University of Defense Technology

2. The Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology

3. School of Computer, National University of Defense Technology

Abstract

Deep Neural Networks (DNNs) have achieved great progress in both academia and industry, but they have become computation- and memory-intensive as network depth increases. Previous designs seek breakthroughs at the software and hardware levels to mitigate these challenges. At the software level, neural network compression techniques effectively reduce network scale and energy consumption. However, conventional compression algorithms are complex and energy-intensive. At the hardware level, improvements in the semiconductor process have effectively reduced power and energy consumption. However, it is difficult for the traditional Von Neumann architecture to further reduce power consumption, due to the memory wall and the end of Moore's law. To overcome these challenges, spintronic-device-based DNN machines have emerged for their non-volatility, ultra-low power, and high energy efficiency. However, no spin-based design has achieved innovation at both the software and hardware levels; specifically, there is no systematic study of a spin-based DNN architecture for deploying compressed networks. In this study, we present an ultra-efficient Spin-based Architecture for Compressed DNNs (SAC) to substantially reduce power and energy consumption. Specifically, we propose a One-Step Compression (OSC) algorithm that reduces computational complexity with minimal accuracy loss. We also propose a spin-based architecture that achieves better performance for the compressed network. Furthermore, we introduce a novel computation flow that enables the reuse of activations and weights. Experimental results show that our approach reduces the computational complexity of the compression algorithm from 𝒪(Tk³) to 𝒪(k² log k) and achieves a 14×–40× compression ratio. Furthermore, our design attains a 2× improvement in power efficiency and a 5× improvement in computational efficiency compared to Eyeriss.
Our models are available at an anonymous link https://bit.ly/39cdtTa .
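To give a feel for the abstract's claimed complexity reduction, the sketch below compares the operation counts implied by the two reported asymptotic bounds: 𝒪(Tk³) for a conventional iterative compression algorithm (T iterations over a k×k kernel factorization) versus 𝒪(k² log k) for OSC. This is a hypothetical illustration of the arithmetic only, not the paper's implementation; the parameter names `T` and `k` and the example values are assumptions for illustration.

```python
import math

def osc_speedup(T: int, k: int) -> float:
    """Ratio of operation counts implied by the reported asymptotic
    complexities (illustrative only; constant factors are ignored)."""
    conventional = T * k ** 3            # O(T k^3): T iterations, k^3 work each
    osc = k ** 2 * math.log2(k)          # O(k^2 log k): one-step compression
    return conventional / osc

# Hypothetical example: k = 512 weight dimension, T = 100 iterations.
print(f"Estimated speedup: {osc_speedup(100, 512):,.0f}x fewer operations")
```

Note the dominant factor: eliminating the iteration count T and dropping one power of k, so the gap widens quickly with larger layers.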

Funder

National Key R&D

NSFC

NSF of Hunan Province

STIP of Hunan Province

Key Laboratory of Advanced Microprocessor Chips and Systems

Hunan Postgraduate Research Innovation

Publisher

Association for Computing Machinery (ACM)

Subject

Hardware and Architecture,Information Systems,Software

References (75 articles; first 5 shown)

1. Martín Abadi. 2016. TensorFlow: Learning functions at scale. In Proceedings of the 21st ACM SIGPLAN International Conference on Functional Programming. 1–1.

2. Md. Hasibul Amin, Mohammed Elbtity, Mohammadreza Mohammadi, and Ramtin Zand. 2022. MRAM-based analog sigmoid function for in-memory computing. In Proceedings of the Great Lakes Symposium on VLSI 2022. 319–323.

3. Aayush Ankit, Abhronil Sengupta, Priyadarshini Panda, and Kaushik Roy. 2017. RESPARC: A reconfigurable and energy-efficient architecture with memristive crossbars for deep spiking neural networks. In 2017 54th ACM/EDAC/IEEE Design Automation Conference (DAC’17). 1–6.

4. Geoffrey W. Burr. 2017. Neuromorphic computing using non-volatile memory. Advances in Physics: X (2017).

5. Yu-Hsin Chen. 2016. Eyeriss: An energy-efficient reconfigurable accelerator for deep convolutional neural networks. IEEE Journal of Solid-State Circuits (2016).
