Affiliation:
1. RWTH Aachen, Institute of Electronic Materials II, D-52074 Aachen, Germany
2. Forschungszentrum Jülich, Peter Grünberg Institut 7, D-52428 Jülich, Germany
Abstract
Computation-in-Memory accelerators based on resistive switching devices represent a promising approach to realizing future information processing systems. These architectures promise orders of magnitude lower energy consumption for certain tasks, while also achieving higher throughput than other special-purpose hardware such as GPUs, owing to their analog mode of computation. Due to device variability, however, a single resistive switching cell usually does not provide the resolution required for the considered applications. To overcome this challenge, many of the proposed architectures use an approach called bit slicing, in which multiple low-resolution components are combined to realize higher-resolution blocks. In this paper, we present a circuit-level analog accelerator architecture that can perform vector-matrix multiplications or matrix-matrix multiplications. The architecture consists of a 1T1R crossbar array, optimized select circuitry, and an ADC. The components are designed to handle the variability of the resistive switching cells, which is verified using our physics-based, experimentally verified compact model. We then use this architecture to compare different bit slicing approaches and discuss their tradeoffs.
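As a rough illustration of the bit-slicing idea the abstract describes, the sketch below splits an integer weight matrix into low-resolution slices, performs one vector-matrix multiplication per slice (standing in for one analog crossbar operation), and recombines the digitized partial results with binary shifts. The function name, slice widths, and unsigned-weight assumption are hypothetical choices for the example, not details of the architecture in the paper.

```python
import numpy as np

def bit_sliced_vmm(x, W, bits_total=8, bits_per_slice=2):
    """Sketch of bit slicing: split `bits_total`-bit unsigned weights in W
    into slices of `bits_per_slice` bits, compute one VMM per slice
    (modeling one low-resolution crossbar each), then recombine the
    partial sums with shift-and-add."""
    n_slices = bits_total // bits_per_slice
    base = 2 ** bits_per_slice
    result = np.zeros(W.shape[1], dtype=np.int64)
    remaining = W.astype(np.int64).copy()
    for s in range(n_slices):
        w_slice = remaining % base                   # low-resolution conductance slice
        remaining //= base
        partial = x.astype(np.int64) @ w_slice       # analog VMM on one crossbar slice
        result += partial << (s * bits_per_slice)    # shift-and-add recombination
    return result

# The recombined result matches the full-precision VMM:
rng = np.random.default_rng(0)
W = rng.integers(0, 256, size=(4, 3))
x = rng.integers(0, 16, size=4)
assert np.array_equal(bit_sliced_vmm(x, W), x @ W)
```

In hardware, each slice corresponds to a separate crossbar (or column group) whose cells only need to distinguish `2**bits_per_slice` conductance levels, which relaxes the per-cell resolution requirement at the cost of extra arrays and ADC conversions.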
Cited by: 1 article.