Author:
Cardarilli Gian Carlo, Di Nunzio Luca, Fazzolari Rocco, Giardino Daniele, Nannarelli Alberto, Re Marco, Spanò Sergio
Abstract
In this work, a novel architecture, named pseudo-softmax, is presented to compute an approximated form of the softmax function. This architecture can be fruitfully used in the last layer of Neural Networks and Convolutional Neural Networks for classification tasks, and in Reinforcement Learning hardware accelerators to compute the Boltzmann action-selection policy. The proposed pseudo-softmax design, intended for efficient hardware implementation, exploits the typical integer quantization of hardware-based Neural Networks to obtain an accurate approximation of the result. The paper gives a detailed description of the architecture and performs an extensive analysis of the approximation error, using both custom stimuli and real-world Convolutional Neural Network inputs. The implementation results, based on CMOS standard-cell technology, show reduced approximation errors compared to state-of-the-art architectures.
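The exact softmax the paper approximates is softmax(x)_i = e^{x_i} / Σ_j e^{x_j}. As a rough illustration of the kind of approximation described (not the paper's actual fixed-point architecture), the sketch below contrasts the exact softmax with a base-2 variant, which is a common hardware-friendly substitution since powers of two map to shifts on integer-quantized inputs; the function names and the choice of base here are assumptions for illustration only:

```python
import math

def softmax(x):
    # Exact softmax with max-subtraction for numerical stability.
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    s = sum(exps)
    return [e / s for e in exps]

def base2_softmax(x):
    # Hardware-friendly sketch: replace e^x with 2^x, so that for
    # integer-quantized inputs the numerator becomes a bit shift.
    # Illustrative only; the pseudo-softmax architecture in the paper
    # differs in its fixed-point implementation details.
    m = max(x)
    pows = [2.0 ** (v - m) for v in x]
    s = sum(pows)
    return [p / s for p in pows]

logits = [2.0, 1.0, 0.5]
print(softmax(logits))
print(base2_softmax(logits))
```

Both variants produce a valid probability distribution (non-negative, summing to one) and preserve the ranking of the logits, which is what matters for classification argmax and for Boltzmann action selection.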
Publisher
Springer Science and Business Media LLC
References (32 articles)
1. Bishop, C. M. Pattern Recognition and Machine Learning, Chapter 2 (Springer, 2006).
2. Capra, M. et al. An updated survey of efficient hardware architectures for accelerating deep convolutional neural networks. Future Internet 12, 113 (2020).
3. Hubara, I., Courbariaux, M., Soudry, D., El-Yaniv, R. & Bengio, Y. Quantized neural networks: Training neural networks with low precision weights and activations. J. Mach. Learn. Res. 18, 6869–6898 (2017).
4. Guo, K., Zeng, S., Yu, J., Wang, Y. & Yang, H. [DL] A survey of FPGA-based neural network inference accelerators. ACM Trans. Reconfigurable Technol. Syst. (TRETS) 12, 1–26 (2019).
5. Alwani, M., Chen, H., Ferdman, M. & Milder, P. Fused-layer CNN accelerators. in 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), 1–12 (IEEE, 2016).
Cited by (49 articles)
1. Environmental-Impact-Based Multi-Agent Reinforcement Learning;Applied Sciences;2024-07-24
2. Exploring Variable Latency Dividers in Vector Hardware Accelerators;2024 19th Conference on Ph.D Research in Microelectronics and Electronics (PRIME);2024-06-09
3. A Hardware-Friendly Alternative to Softmax Function and Its Efficient VLSI Implementation for Deep Learning Applications;2024 IEEE International Symposium on Circuits and Systems (ISCAS);2024-05-19
4. 8-bit Transformer Inference and Fine-tuning for Edge Accelerators;Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3;2024-04-27
5. AttBind: Memory-Efficient Acceleration for Long-Range Attention Using Vector-Derived Symbolic Binding;2024 Design, Automation & Test in Europe Conference & Exhibition (DATE);2024-03-25