Affiliation:
1. Department of Electronics, Carleton University, 1125 Colonel By Drive, Ottawa, ON K1S 5B6, Canada
Abstract
Artificial intelligence (AI) has revolutionized present-day life through automation and independent decision-making capabilities. For AI hardware implementations, the 6T-SRAM cell is a suitable candidate due to its performance edge over its counterparts. However, modern AI hardware such as neural networks (NNs) accesses off-chip data frequently, which degrades overall system performance. Compute-in-memory (CIM) reduces off-chip data access transactions. One CIM approach operates in the mixed-signal domain, but it suffers from limited bit precision and signal margin issues. An emerging alternative operates in the all-digital signal domain, which provides better signal margins and bit precision at the expense of hardware overhead. We have analyzed silicon-verified, digital-domain 6T-SRAM CIM solutions after classifying them as SRAM-based accelerators, i.e., near-memory computing (NMC), and custom SRAM-based CIM, i.e., in-memory computing (IMC). We have focused on multiply-and-accumulate (MAC), the most frequent operation in convolutional neural networks (CNNs), and compared state-of-the-art implementations. Neural networks with low weight precision, i.e., <12b, show lower accuracy but higher power efficiency. An input precision of 8b meets implementation requirements. The maximum performance reported is 7.49 TOPS at 330 MHz, while custom SRAM-based performance has shown a maximum of 5.6 GOPS at 100 MHz. The second part of this article analyzes the FinFET 6T-SRAM as one of the critical components in determining the overall performance of an AI computing system. We have investigated FinFET 6T-SRAM cell performance and limitations as dictated by FinFET technology-specific parameters, such as sizing, threshold voltage (Vth), supply voltage (VDD), and process and environmental variations. The HD FinFET 6T-SRAM cell shows 32% lower read access time and 1.09 times better leakage power compared with the HC cell configuration. The minimum achievable supply voltage is 600 mV without any read- or write-assist scheme for all cell configurations, while temperature variations cause noise margin deviations of up to 22% of the nominal values.
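The two peak throughput figures quoted in the abstract are reported at different clock frequencies, so one rough way to compare them is to normalize each to operations per clock cycle. The sketch below does only that arithmetic. The function and variable names are illustrative assumptions and do not come from the article. The normalization also ignores differences in bit precision, array size, and technology node between the designs, so it should be read as a back-of-the-envelope aid rather than a fair benchmark.

```python
# Illustrative arithmetic only: names below are assumptions, not from the article.
TERA = 1e12  # operations per second in 1 TOPS
GIGA = 1e9   # operations per second in 1 GOPS
MEGA = 1e6   # hertz in 1 MHz


def ops_per_cycle(ops_per_second: float, clock_hz: float) -> float:
    """Return the average number of operations completed per clock cycle."""
    if clock_hz <= 0:
        raise ValueError("clock frequency must be positive")
    return ops_per_second / clock_hz


# Figures quoted in the abstract.
peak_reported = ops_per_cycle(7.49 * TERA, 330 * MEGA)  # 7.49 TOPS at 330 MHz
custom_sram = ops_per_cycle(5.6 * GIGA, 100 * MEGA)     # 5.6 GOPS at 100 MHz

if __name__ == "__main__":
    print(f"Peak reported: {peak_reported:,.0f} ops/cycle")
    print(f"Custom SRAM:   {custom_sram:,.0f} ops/cycle")
```

Under this normalization, the 7.49 TOPS design completes roughly 22,700 operations per cycle, while the 5.6 GOPS custom SRAM-based design completes about 56, which suggests that the gap comes mainly from parallelism rather than from clock frequency.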
Funder
CURIE Fund administered by MacOdrum Library
Subject
Electrical and Electronic Engineering, Mechanical Engineering, Control and Systems Engineering