Affiliation:
1. Georgia Institute of Technology, Atlanta, GA, US
Abstract
Emerging non-volatile memory (eNVM) based mixed-signal Compute-in-Memory (CIM) accelerators are of great interest in today's AI accelerator design due to their high energy efficiency. Various CIM architectures and circuit-level designs have been proposed, showing superior hardware performance for deep neural network (DNN) acceleration. However, hardware-aware quantization strategies for CIM-based accelerators have not been systematically explored. There are many design options for mapping neural networks onto CIM systems, and improper strategies can narrow the circuit-level design space and limit the achievable hardware performance; a comprehensive early-stage design space exploration is therefore needed to identify quantization and mapping strategies that deliver better hardware performance. In this paper, we provide a joint algorithm-hardware analysis and compare system-level hardware performance across design options, including quantization algorithms, data representation methods, and analog-to-digital converter (ADC) configurations. This work aims to offer guidelines that help chip architects choose more hardware-friendly design options. According to our evaluation results for CIFAR-10/100 and ImageNet classification, a properly chosen quantization approach and optimal mapping strategy (dynamic fixed-point quantization + 2's complement representation/shifted unsigned INT representation + optimized-precision ADC) achieve ∼2× energy efficiency and 1.2∼1.6× throughput with 5%∼25% reduced area overhead, compared to a naïve strategy (fixed-point quantization + differential-pair number representation + full-precision ADC).
Funder
ASCENT, one of the SRC/DARPA JUMP Centers
Publisher
Association for Computing Machinery (ACM)
Subject
Electrical and Electronic Engineering, Computer Graphics and Computer-Aided Design, Computer Science Applications
Cited by
2 articles.