Abstract
The choice of optimizer is critical for convergence in on-chip training. As a second-moment optimizer, adaptive moment estimation (ADAM) offers a significant advantage over non-moment optimizers such as stochastic gradient descent (SGD) and first-moment optimizers such as Momentum. However, ADAM is difficult to implement in hardware because of its computationally intensive operations, including squaring, square-root extraction, and division. This work proposes Hardware-ADAM (HW-ADAM), an efficient fixed-point accelerator for ADAM built on hardware-oriented mathematical optimizations. HW-ADAM comes in two designs. The Efficient-ADAM (E-ADAM) unit reduces hardware resource consumption by around 90% compared with related work and achieves a throughput of 2.89 MUOP/s (million updating operations per second), 2.8× that of the original ADAM. The Fast-ADAM (F-ADAM) unit uses 91.5% fewer flip-flops, 65.7% fewer look-up tables, and 50% fewer DSPs than related work and achieves a throughput of 16.7 MUOP/s, 16.4× that of the original ADAM.
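For reference, the sketch below shows the standard ADAM update in floating point, making explicit the squaring, square-root, and division operations the abstract identifies as the hardware bottleneck. This is a minimal NumPy illustration of the baseline algorithm, not the paper's fixed-point E-ADAM or F-ADAM datapath; the function name and hyperparameter defaults are conventional choices, not taken from the paper.

```python
import numpy as np

def adam_update(theta, grad, m, v, t,
                lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One standard ADAM step (illustrative sketch, not the HW-ADAM datapath)."""
    m = beta1 * m + (1 - beta1) * grad         # first moment (as in Momentum)
    v = beta2 * v + (1 - beta2) * grad ** 2    # second moment: squaring
    m_hat = m / (1 - beta1 ** t)               # bias correction: division
    v_hat = v / (1 - beta2 ** t)               # bias correction: division
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)  # square root + division
    return theta, m, v

# Example: one update of a scalar parameter
# theta, m, v = adam_update(theta=0.5, grad=0.1, m=0.0, v=0.0, t=1)
```

The square root and the two divisions per parameter per step are what make a naive fixed-point mapping expensive, which is the cost the hardware-oriented optimizations in HW-ADAM target.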
Funder
National Natural Science Foundation of China
Subject
Electrical and Electronic Engineering, Computer Networks and Communications, Hardware and Architecture, Signal Processing, Control and Systems Engineering
Cited by
3 articles.