Author:
Ying Hejie, Song Mengmeng, Tang Yaohong, Xiao Shungen, Xiao Zimin
Abstract
Deep neural networks have achieved remarkable success in various fields. However, training an effective deep neural network still poses challenges. This paper proposes a method for making the training of deep neural networks more effective, with the goal of improving their performance. First, based on the observation that the parameters (weights and biases) of a deep neural network change according to certain rules during training, the potential of parameter prediction for improving training efficiency is identified. Second, it is shown that parameter prediction can also improve the performance of a deep neural network through the noise injection introduced by prediction errors. Then, taking the limitations into account comprehensively, a Parameters Linear Prediction method for deep neural networks is proposed. Finally, performance and hyperparameter sensitivity are validated on several representative backbones. Experimental results show that, compared with SGD, the proposed Parameters Linear Prediction method yields an approximately 1% increase in accuracy for the optimal model, along with a reduction of about 0.01 in top-1/top-5 error. It also performs stably under various hyperparameter settings, which demonstrates the effectiveness of the proposed method and validates its capacity to enhance the network's training efficiency and performance.
Funder
Key Technology Innovation Project of Fujian Province
Youth and the collaborative innovation center project of Ningde Normal University
Publisher
Springer Science and Business Media LLC