Abstract
In deep learning, vanilla stochastic gradient descent (SGD) and SGD with heavy-ball momentum (SGDM) are widely used because of their simplicity and strong generalization. This paper proposes TSGD, which uses an exponential scaling method to realize a smooth and stable transition from SGDM to SGD, combining the fast training speed of SGDM with the accurate convergence of SGD. We also provide theoretical results on the convergence of this algorithm. In addition, to exploit the stability of learning rate warmup and the high accuracy of learning rate decay, we propose 2ExpLR, a warmup–decay learning rate strategy built from two exponential functions. Experimental results on different datasets show that the proposed algorithms significantly improve accuracy and make training faster and more stable.
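The abstract describes the two components only at a high level, so the Python sketch below illustrates one plausible reading of them. The decay constant 5.0, the warmup fraction, and the helper names tsgd_step and two_exp_lr are illustrative assumptions, not the paper's exact formulas.

import math

def tsgd_step(param, grad, velocity, lr, step, total_steps, momentum=0.9):
    """One TSGD-style update: the heavy-ball momentum term is damped by an
    exponentially decaying factor, so the update transitions from SGDM
    (early training) toward plain SGD (late training)."""
    # Assumed exponential scaling factor: near 1 at the start, near 0 at the end.
    scale = math.exp(-5.0 * step / total_steps)
    velocity = momentum * velocity + grad
    # Blend the momentum-driven update with the plain gradient update.
    update = scale * velocity + (1.0 - scale) * grad
    return param - lr * update, velocity

def two_exp_lr(step, total_steps, base_lr=0.1, warmup_frac=0.05):
    """A 2ExpLR-style schedule: an exponential warmup followed by an
    exponential decay (both exponents and the warmup fraction are
    illustrative choices)."""
    warmup_steps = max(1, int(warmup_frac * total_steps))
    if step < warmup_steps:
        # Exponential ramp from a small value up to base_lr.
        return base_lr * math.exp(5.0 * (step / warmup_steps - 1.0))
    # Exponential decay from base_lr after warmup.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * math.exp(-5.0 * progress)

Under these assumptions, the momentum contribution and the learning rate both vary smoothly with training progress, which is the stability argument the abstract makes for TSGD and 2ExpLR.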
Funder
Natural Science Foundation of Jilin Province, China
National Natural Science Foundation of China
National Key R&D Program of China
Fundamental Research Funds for the Central Universities of China