Flatten-T Swish: a thresholded ReLU-Swish-like activation function for deep learning

Published:2018-07-31 Issue:2 Volume:4 Page:76
ISSN:2548-3161
Container-title:International Journal of Advances in Intelligent Informatics
Short-container-title:Int. J. Adv. Intell. Informatics

Author:

Chieng Hock Hung,Wahid Noorhaniza,Pauline Ong,Perla Sai Raj Kishore

Abstract

Activation functions are essential for deep learning methods to learn and perform complex tasks such as image classification. The Rectified Linear Unit (ReLU) has been widely used and has been the default activation function across the deep learning community since 2012. Despite its popularity, the hard zero property of ReLU heavily hinders negative values from propagating through the network. Consequently, deep neural networks have not benefited from negative representations. In this work, an activation function called Flatten-T Swish (FTS) that leverages the benefit of negative values is proposed. To verify its performance, this study evaluates FTS against ReLU and several recent activation functions. Each activation function is trained on the MNIST dataset using five different deep fully connected neural networks (DFNNs) with depths varying from five to eight layers. For a fair evaluation, all DFNNs use the same configuration settings. Based on the experimental results, FTS with a threshold value of T=-0.20 has the best overall performance. Compared with ReLU, FTS (T=-0.20) improves MNIST classification accuracy by 0.13%, 0.70%, 0.67%, 1.07% and 1.15% on the wider 5-layer, slimmer 5-layer, 6-layer, 7-layer and 8-layer DFNNs, respectively. The study also observed that FTS converges twice as fast as ReLU. Although other existing activation functions are also evaluated, this study uses ReLU as the baseline activation function.
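
The abstract names the function and its best threshold but does not give the formula. As a rough illustration only, the sketch below assumes the definition usually attributed to FTS: x * sigmoid(x) + T for x >= 0, and the constant T otherwise. The function names `flatten_t_swish` and `relu` are hypothetical, and the scalar pure-Python form is chosen for readability, not taken from the paper.

```python
import math


def relu(x: float) -> float:
    """Standard ReLU, shown for comparison: negative inputs are clamped to 0."""
    return max(0.0, x)


def flatten_t_swish(x: float, t: float = -0.20) -> float:
    """Assumed FTS definition (not stated in the abstract).

    For x >= 0 the output is the Swish term x * sigmoid(x) shifted by t;
    for x < 0 the output is flattened to the constant threshold t.
    The default t = -0.20 is the value the abstract reports as best overall.
    """
    if x >= 0:
        # x * sigmoid(x) written as x / (1 + e^-x); safe because -x <= 0 here.
        return x / (1.0 + math.exp(-x)) + t
    return t


if __name__ == "__main__":
    # Compare both activations on a few sample inputs.
    for value in (-3.0, -0.5, 0.0, 0.5, 2.0):
        print(f"x={value:+.1f}  relu={relu(value):.4f}  fts={flatten_t_swish(value):.4f}")
```

Under this assumed form, every negative input maps to T instead of 0, so a negative T lets the unit emit a small negative value where ReLU would output exactly zero.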

Funder

Office for Research, Innovation, Commercialization and Consultancy Management (ORICC), Postgraduate Research Grant (GPPS) under Vot U817 and Universiti Tun Hussein Onn Malaysia (UTHM)

Publisher

Universitas Ahmad Dahlan, Kampus 3

Subject

Artificial Intelligence,Computer Vision and Pattern Recognition,Human-Computer Interaction

Cited by 24 articles.

1. Bayesian Optimization for Sparse Neural Networks With Trainable Activation Functions;IEEE Transactions on Pattern Analysis and Machine Intelligence;2024-10

2. Malaria Parasite Detection in Microscopic Blood Smear Images Using Deep Learning Techniques;2024 International Joint Conference on Neural Networks (IJCNN);2024-06-30

3. An academic recommender system on large citation data based on clustering, graph modeling and deep learning;Knowledge and Information Systems;2024-04-18

4. modSwish: a new activation function for neural network;Evolutionary Intelligence;2024-02-07

5. Activation Function Conundrums in the Modern Machine Learning Paradigm;2023 International Conference on Computer and Applications (ICCA);2023-11-28