Hardware-aware Quantization/Mapping Strategies for Compute-in-Memory Accelerators

Author:

Shanshi Huang¹, Hongwu Jiang¹, Shimeng Yu¹

Affiliation:

1. Georgia Institute of Technology, Atlanta, GA, USA

Abstract

Emerging non-volatile memory (eNVM) based mixed-signal compute-in-memory (CIM) accelerators are of great interest in today's AI accelerator design due to their high energy efficiency. Various CIM architectures and circuit-level designs have been proposed, showing superior hardware performance for deep neural network (DNN) acceleration. However, hardware-aware quantization strategies for CIM-based accelerators have not been systematically explored. Because there are many design options for mapping neural networks onto CIM systems, and improper strategies may narrow the circuit-level design space and thus limit hardware performance, a comprehensive early-stage design space exploration is needed to find quantization/mapping strategies that achieve better hardware performance. In this paper, we provide a joint algorithm-hardware analysis and compare system-level hardware performance across design options, including quantization algorithms, data representation methods, and analog-to-digital converter (ADC) configurations. This work aims to provide guidelines for chip architects in choosing more hardware-friendly design options. According to our evaluation results for CIFAR-10/100 and ImageNet classification, a properly chosen quantization approach and optimal mapping strategy (dynamic fixed-point quantization + 2's complement representation/shifted unsigned INT representation + optimized-precision ADC) achieves ∼2× energy efficiency and 1.2∼1.6× throughput with 5%∼25% reduced area overhead, compared to a naïve strategy (fixed-point quantization + differential-pair number representation + full-precision ADC).
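To make the abstract's design options concrete, the following is a minimal illustrative sketch (not the authors' implementation) of two of the named strategies: per-tensor dynamic fixed-point quantization, where the radix point is chosen per layer from the largest weight magnitude rather than fixed globally, and the shifted unsigned INT representation, which maps signed weights to non-negative values so each weight needs a single conductance plus a constant offset column instead of a differential device pair. All function names and details here are assumptions for illustration.

```python
import numpy as np

def dynamic_fixed_point_quantize(w, bits=8):
    """Dynamic fixed-point quantization (illustrative sketch).

    The number of fractional bits is chosen per tensor so that the
    largest magnitude in `w` still fits in a signed `bits`-bit integer.
    """
    max_abs = np.max(np.abs(w))
    # Integer bits needed to cover max_abs; the rest become fraction bits.
    frac_bits = bits - 1 - int(np.ceil(np.log2(max_abs + 1e-12)))
    scale = 2.0 ** frac_bits
    q = np.clip(np.round(w * scale),
                -(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    return q.astype(np.int32), scale

def to_shifted_unsigned(q, bits=8):
    """Shifted unsigned INT representation (illustrative sketch).

    A signed weight q in [-2^(b-1), 2^(b-1)-1] is stored as q + 2^(b-1),
    so a single device per weight suffices; the constant shift can be
    subtracted once per column after the analog dot product.
    """
    return q + 2 ** (bits - 1)  # now in [0, 2**bits - 1]

w = np.array([-0.7, 0.31, 0.05, -0.02])
q, scale = dynamic_fixed_point_quantize(w, bits=8)
u = to_shifted_unsigned(q, bits=8)
```

Dequantizing with `q / scale` recovers each weight to within one quantization step (1/128 for this example tensor), and every entry of `u` is non-negative, as required for a single-conductance mapping.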

Funder

ASCENT, one of the SRC/DARPA JUMP Centers

Publisher

Association for Computing Machinery (ACM)

Subject

Electrical and Electronic Engineering, Computer Graphics and Computer-Aided Design, Computer Science Applications


Cited by 2 articles.
