Affiliations:
1. Departamento de Sistemas, Universidad Autónoma Metropolitana, Azcapotzalco 02200, Mexico
2. Departamento de Ciencias Básicas, Universidad Autónoma Metropolitana, Azcapotzalco 02200, Mexico
Abstract
A novel morphing activation function is proposed, motivated by wavelet theory and the use of wavelets as activation functions. Morphing refers to a gradual change of shape that mimics several apparently unrelated activation functions. The shape is controlled by the fractional-order derivative, a trainable parameter optimized during the neural network's learning process. From the morphing activation function, taking only integer-order derivatives yields efficient piecewise polynomial versions of several existing activation functions. Experiments show that the performance of the polynomial versions PolySigmoid, PolySoftplus, PolyGeLU, PolySwish, and PolyMish is similar to or better than that of their counterparts Sigmoid, Softplus, GeLU, Swish, and Mish. Furthermore, the best shape can be learned from the data by optimizing the fractional-order derivative with gradient-descent algorithms, motivating the study of a more general formula, based on fractional calculus, for building and adapting activation functions with properties useful in machine learning.
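As a rough illustration of the trainable-shape idea described in the abstract, the sketch below shows an activation function whose shape parameter is registered as a learnable parameter and optimized by gradient descent together with the network weights. The abstract does not give the morphing activation's formula, so this sketch substitutes Swish, x * sigmoid(alpha * x), with a trainable alpha as a stand-in; the class name, initial value, and toy model are illustrative assumptions, not the paper's method.

import torch
import torch.nn as nn

class TrainableShapeActivation(nn.Module):
    # Stand-in for a shape-adaptive activation: Swish with a learnable
    # shape parameter alpha (NOT the paper's morphing activation, whose
    # formula is not stated in the abstract).
    def __init__(self, alpha_init: float = 1.0):
        super().__init__()
        # nn.Parameter makes alpha part of model.parameters(), so the
        # optimizer updates the activation's shape during training.
        self.alpha = nn.Parameter(torch.tensor(alpha_init))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * torch.sigmoid(self.alpha * x)

# Usage: the activation's shape is fitted to the data along with the weights.
model = nn.Sequential(nn.Linear(8, 16), TrainableShapeActivation(), nn.Linear(16, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

The same mechanism applies if the shape parameter is a fractional order of differentiation, as in the paper: any differentiable parameterization of the activation can be trained this way.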