Authors:
Agwa Shady, Prodromakis Themis
Abstract
Artificial intelligence applications currently dominate the technology landscape. Meanwhile, conventional Von Neumann architectures struggle with the data-movement bottleneck as they try to meet the ever-increasing performance demands of these data-centric applications. Moreover, the cost of vector-matrix multiplication in the binary domain is a major computational bottleneck for these applications. This paper introduces a novel digital in-memory stochastic computing architecture that leverages the simplicity of stochastic computing for in-memory vector-matrix multiplication. The proposed architecture incorporates several new approaches: a new stochastic number generator with ideal binary-to-stochastic mapping, a best-seeding approach that keeps low stochastic bit-precisions sufficiently accurate, a hybrid stochastic-binary accumulation approach for vector-matrix multiplication, and the conversion of conventional memory read operations into on-the-fly stochastic multiplication operations with negligible overhead. Thanks to the combination of these approaches, the accuracy analysis of the vector-matrix multiplication benchmark shows that scaling the stochastic bit-precision down from 16-bit to 4-bit yields nearly the same average error (less than 3%). The derived analytical model of the proposed in-memory stochastic computing architecture shows that the 4-bit stochastic architecture achieves the highest throughput per sub-array (122 Ops/Cycle), 4.36x higher than that of the 16-bit stochastic precision, while still maintaining a small average error of 2.25%.
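As a rough illustration of the general idea behind stochastic vector-matrix multiplication (not the architecture proposed in the paper), the sketch below uses textbook unipolar stochastic computing: a value is encoded as the fraction of 1s in a bitstream, two independent streams are multiplied with a bitwise AND, and each product stream is reduced to a binary count that is then summed in ordinary binary arithmetic. The function names, the use of Python's pseudo-random generator as the stochastic number generator, and the stream length of 2**n_bits are all assumptions made for this example; the paper's own generator, seeding scheme, and in-memory read mechanism are not reproduced here.

```python
import random


def to_stream(value, n_bits, rng):
    """Encode an integer in 0..2**n_bits as a unipolar stochastic bitstream.

    Assumption: the stream has 2**n_bits bits, and each bit is 1 with
    probability value / 2**n_bits (a simple comparator-style generator).
    """
    length = 1 << n_bits
    return [1 if rng.randrange(length) < value else 0 for _ in range(length)]


def sc_multiply(a_stream, b_stream):
    """Multiply two independent unipolar streams with a bitwise AND."""
    return [a & b for a, b in zip(a_stream, b_stream)]


def sc_dot(xs, ws, n_bits, seed=0):
    """Estimate sum(x * w) / 2**n_bits for integer inputs.

    Each product is computed in the stochastic domain; the count of 1s in
    each product stream is then accumulated as a plain binary integer.
    """
    rng = random.Random(seed)
    total = 0
    for x, w in zip(xs, ws):
        product = sc_multiply(to_stream(x, n_bits, rng), to_stream(w, n_bits, rng))
        total += sum(product)  # binary accumulation of the popcount
    return total


def exact_dot_scaled(xs, ws, n_bits):
    """Reference value that sc_dot approximates."""
    return sum(x * w for x, w in zip(xs, ws)) / (1 << n_bits)


if __name__ == "__main__":
    xs, ws = [3, 12, 7, 15], [9, 4, 14, 2]
    print("stochastic estimate:", sc_dot(xs, ws, n_bits=4))
    print("exact (scaled):     ", exact_dot_scaled(xs, ws, n_bits=4))
```

With short streams such as these, the estimate fluctuates with the seed, which is why the choice of generator and seeding matters at low bit-precisions.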
Subject
Electrical and Electronic Engineering; Computer Science Applications; Biomedical Engineering; Atomic and Molecular Physics, and Optics; Electronic, Optical and Magnetic Materials
Cited by
1 article.