DAM SRAM CORE: An Efficient High-Speed and Low-Power CIM SRAM CORE Design for Feature Extraction Convolutional Layers in Binary Neural Networks
Published: 2024-04-30
Volume: 15
Issue: 5
Page: 617
ISSN: 2072-666X
Container-title: Micromachines
Language: en
Author:
Zhao Ruiyong 1,2, Gong Zhenghui 1, Liu Yulan 1,2, Chen Jing 1
Affiliation:
1. Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200031, China
2. University of Chinese Academy of Sciences, Beijing 100049, China
Abstract
This article proposes a novel design for an in-memory computing SRAM, the DAM SRAM CORE, which integrates storage and computation within a unified 11T SRAM cell and enables large-scale parallel Multiply–Accumulate (MAC) operations within the SRAM array. This design not only improves the area efficiency of the individual cells but also yields a compact layout. A key highlight of the design is its dynamic aXNOR-based computation mode, which significantly reduces both dynamic and static power consumption during computation within the array. Additionally, the design incorporates a self-stabilizing voltage-gradient quantization circuit, which improves the computational accuracy of the overall system. The 64 × 64 bit DAM SRAM CORE in-memory computing core was fabricated using a 55 nm CMOS logic process and validated via simulations. The experimental results show that the core delivers 5-bit output results from 1-bit input feature data and 1-bit weight data while maintaining a static power density of 0.48 mW/mm² and a computational power density of 11.367 mW/mm², demonstrating its low-power characteristics. Furthermore, the core achieves a data throughput of 109.75 GOPS and an energy efficiency of 21.95 TOPS/W, which validate the effectiveness and advanced nature of the proposed in-memory computing core design.
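For illustration, the sketch below shows the binary XNOR-MAC arithmetic that such a CIM column performs on 1-bit inputs and 1-bit weights; the ±1 encoding, the 64-row column size (taken from the 64 × 64 array), and the `quantize_to_5bit` helper are assumptions for this example, standing in for the analog accumulation and the self-stabilizing voltage-gradient quantization circuit described in the abstract.

```python
# Minimal software sketch of a binary XNOR-MAC over one 64-row CIM column.
# The 0/1 bit encoding, the uniform 5-bit quantization, and the column size
# are illustrative assumptions; the DAM SRAM CORE realizes the same
# arithmetic in the analog/mixed-signal domain.
import random

ROWS = 64  # one column of the 64 x 64 array

def xnor_mac(inputs, weights):
    """Accumulate 1-bit XNOR products over one column.

    XNOR(a, b) = 1 when a == b, which matches multiplying the
    +/-1-encoded values and mapping +1 -> 1, -1 -> 0.
    """
    assert len(inputs) == len(weights) == ROWS
    return sum(1 if a == b else 0 for a, b in zip(inputs, weights))

def quantize_to_5bit(popcount):
    """Map the 0..64 popcount onto a 5-bit code (0..31).

    Hypothetical uniform quantization standing in for the paper's
    self-stabilizing voltage-gradient quantization circuit.
    """
    return min(popcount * 32 // (ROWS + 1), 31)

if __name__ == "__main__":
    random.seed(0)
    x = [random.randint(0, 1) for _ in range(ROWS)]  # 1-bit input features
    w = [random.randint(0, 1) for _ in range(ROWS)]  # 1-bit stored weights
    acc = xnor_mac(x, w)
    print(f"popcount = {acc}, 5-bit output = {quantize_to_5bit(acc)}")
```

As a consistency note on the reported figures, dividing the 109.75 GOPS throughput by the 21.95 TOPS/W energy efficiency corresponds to roughly 5 mW of computational power for the array.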
Funder:
Science and Technology Commission of Shanghai Municipality; Shanghai Zhangjiang Laboratory