Affiliation:
1. Robert Bosch GmbH, Renningen, Germany
2. Fraunhofer IPMS, Center Nanoelectronic Technologies, Dresden, Germany
3. University of Kaiserslautern, Kaiserslautern, Germany
Abstract
Today, a large number of applications depend on deep neural networks (DNN) to process data and perform complicated tasks at restricted power and latency specifications. Therefore, processing-in-memory (PIM) platforms are actively explored as a promising approach to improve the throughput and the energy efficiency of DNN computing systems. Several PIM architectures adopt resistive non-volatile memories as their main unit to build crossbar-based accelerators for DNN inference. However, these structures suffer from several drawbacks such as reliability, low accuracy, large ADCs/DACs power consumption and area, high write energy, and so on. In this article, we present a new mixed-signal in-memory architecture based on the bit-decomposition of the multiply and accumulate (MAC) operations. Our in-memory inference architecture uses a single FeFET as a non-volatile memory cell. Compared to the prior work, this system architecture provides a high level of parallelism while using only 3-bit ADCs. Also, it eliminates the need for any DAC. In addition, we provide flexibility and a very high utilization efficiency even for varying tasks and loads. Simulations demonstrate that we outperform state-of-the-art efficiencies with 36.5 TOPS/W and can pack 2.05 TOPS with 8-bit activation and 4-bit weight precision in an area of 4.9 mm
2
using 22 nm FDSOI technology. Employing binary operation, we obtain 1169 TOPS/W and over 261 TOPS/W/mm
2
on system level.
Funder
ECSEL Joint Undertaking project TEMPO in collaboration with the European Union’s H2020 Framework Program
National Authorities
Carl Zeiss Foundation under the grant Sustainable Embedded AI
Publisher
Association for Computing Machinery (ACM)
Subject
Hardware and Architecture,Software
Cited by
14 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献