Affiliation
1. The Ohio State University, Neil Ave., Columbus, OH, USA
2. University of Michigan, Dearborn, MI, USA
Abstract
Recurrent Neural Networks (RNNs) are an important class of neural networks designed to retain and incorporate context into current decisions. RNNs are particularly well suited for machine learning problems in which context is important, such as speech recognition and language translation.
This work presents RNNFast, a hardware accelerator for RNNs that leverages an emerging class of non-volatile memory called domain-wall memory (DWM). We show that DWM is well suited to RNN acceleration because of its high density and low read/write energy. At the same time, the sequential nature of input and weight processing in RNNs mitigates one of the downsides of DWM: its data access time is linear rather than constant.
RNNFast is very efficient and highly scalable, with flexible mapping of logical neurons to RNN hardware blocks. The basic hardware primitive, the RNN processing element (PE), includes custom DWM-based multiplication, sigmoid and tanh units for high density and low energy. The accelerator is designed to minimize data movement by closely interleaving DWM storage and computation. We compare our design with a state-of-the-art GPGPU and find 21.8× higher performance with 70× lower energy.
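The abstract's point about linear access time can be illustrated with a toy cost model. The sketch below is not taken from the paper: it assumes a single DWM track with one access port, where reaching an address costs one shift per position moved. The function name `shift_cost` and the track length of 64 are hypothetical choices made for illustration only.

```python
import random


def shift_cost(addresses, start=0):
    """Total shifts needed to visit `addresses` in order.

    Assumed toy model (not from the paper): one track, one access port,
    and moving from position p to position a costs |a - p| shifts.
    """
    position = start
    total = 0
    for address in addresses:
        total += abs(address - position)
        position = address
    return total


n = 64  # hypothetical number of weights stored on one track

# Sequential order, as in streaming weights through an RNN layer.
sequential = list(range(n))

# The same addresses visited in a scattered order.
scattered = sequential[:]
random.Random(0).shuffle(scattered)

print("sequential shifts:", shift_cost(sequential))  # n - 1 = 63
print("scattered shifts: ", shift_cost(scattered))
```

Under this model, reading the weights in order costs one shift per access, while a scattered order pays for every back-and-forth movement along the track. That is the sense in which a sequential workload hides the linear access cost.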
Funder
National Science Foundation
Publisher
Association for Computing Machinery (ACM)
Subject
Electrical and Electronic Engineering, Hardware and Architecture, Software
Cited by
5 articles.
1. RansomShield: A Visualization Approach to Defending Mobile Systems Against Ransomware;ACM Transactions on Privacy and Security;2023-03-13
2. Voice Keyword Spotting on Edge Devices;2022 5th International Conference on Multimedia, Signal Processing and Communication Technologies (IMPACT);2022-11-26
3. Keyword Spotting with Deep Neural Network on Edge Devices;2022 IEEE 12th International Conference on Electronics Information and Emergency Communication (ICEIEC);2022-07-15
4. Fast-track cache;Proceedings of the 36th ACM International Conference on Supercomputing;2022-06-28
5. Low power multiplier based long short-term memory hardware architecture for smart grid energy management;International Journal of System Assurance Engineering and Management;2022-04-15