A Method for Calculating the Derivative of Activation Functions Based on Piecewise Linear Approximation
Published: 2023-01-04
Issue: 2
Volume: 12
Page: 267
ISSN: 2079-9292
Container-title: Electronics
Language: en
Short-container-title: Electronics
Authors: Liao Xuan, Zhou Tong, Zhang Longlong, Hu Xiang, Peng Yuanxi
Abstract
Nonlinear functions are widely used as activation functions in artificial neural networks and have a great impact on the networks' fitting ability. Because of the complexity of these functions, computing an activation function and its derivative consumes considerable computing resources and time during training. To improve the computational efficiency of activation-function derivatives in the back-propagation of artificial neural networks, this paper proposes a method for calculating the derivative of the activation function based on piecewise linear approximation. The method is hardware-friendly and universal: it can efficiently compute various nonlinear activation functions in neural network hardware accelerators. We use least squares to improve a piecewise linear approximation method so that the absolute error can be controlled while obtaining fewer segments or a smaller average error, which means fewer hardware resources are required. This method is applied to produce a piecewise linear approximation of either the activation function itself or its derivative. Both types of approximated activation functions are substituted into a multilayer perceptron for binary classification experiments to verify the effectiveness of the proposed method. Experimental results show that the same or even slightly higher classification accuracy can be achieved with this method, and the computation time of back-propagation is reduced by 4–6% compared with computing the derivative directly from the function expression using the operators encapsulated in PyTorch. This shows that the proposed method provides an efficient way to handle nonlinear activation functions in hardware acceleration of neural networks.
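As a rough illustration of the idea described in the abstract, the sketch below fits a piecewise linear approximation to the derivative of the sigmoid activation under a user-chosen absolute-error bound, with a least-squares line fitted inside each segment. This is only a minimal Python/NumPy sketch of the general approach; the greedy segmentation strategy, the sigmoid example, and names such as piecewise_linear and eps are illustrative assumptions, not the authors' implementation or error-control scheme.

# Minimal sketch (illustrative, not the paper's algorithm): piecewise linear
# approximation of an activation-function derivative with a bounded absolute
# error, using a least-squares line fit inside each segment.
import numpy as np

def sigmoid_grad(x):
    s = 1.0 / (1.0 + np.exp(-x))
    return s * (1.0 - s)

def fit_segment(x, y):
    # Least-squares line y ~ a*x + b and its maximum absolute error.
    a, b = np.polyfit(x, y, 1)
    return a, b, np.max(np.abs(a * x + b - y))

def piecewise_linear(f, lo, hi, eps=1e-3, n_samples=1601):
    # Greedily extend each segment while the absolute error stays <= eps.
    xs = np.linspace(lo, hi, n_samples)
    ys = f(xs)
    segments, start = [], 0
    while start < len(xs) - 1:
        end, best = start + 1, None
        while end < len(xs):
            a, b, err = fit_segment(xs[start:end + 1], ys[start:end + 1])
            if err > eps:
                break
            best = (xs[start], xs[end], a, b)
            end += 1
        segments.append(best)
        start = end - 1
    return segments  # list of (x_left, x_right, slope, intercept)

def eval_piecewise(segments, x):
    # One comparison, one multiply, and one add per evaluation.
    x = np.clip(x, segments[0][0], segments[-1][1])
    for _, x_right, a, b in segments:
        if x <= x_right:
            return a * x + b

segs = piecewise_linear(sigmoid_grad, -8.0, 8.0, eps=1e-3)
print(len(segs), "segments, error at x=1.3:",
      abs(eval_piecewise(segs, 1.3) - sigmoid_grad(1.3)))

Once the segments are fitted offline, each evaluation reduces to one range comparison, one multiplication, and one addition, which is what makes this kind of approximation attractive for hardware accelerators.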
Subject
Electrical and Electronic Engineering, Computer Networks and Communications, Hardware and Architecture, Signal Processing, Control and Systems Engineering
Cited by
3 articles.