Authors:
Wang Yijun, Zhou Pengyu, Zhong Wenya
Abstract
Despite superior training outcomes, adaptive optimization methods such as Adam, Adagrad, and RMSprop have been found to generalize more poorly than stochastic gradient descent (SGD). Keskar et al. (2017) therefore proposed a hybrid strategy that starts training with Adam and switches to SGD at an appropriate point. Moreover, in learning tasks with a large output space, Adam has been observed to fail to converge to an optimal solution (or, in non-convex settings, to a stationary point) [1]. This paper therefore proposes AMSGrad, a new variant of the Adam algorithm that not only resolves the convergence problem but also improves empirical performance.
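AMSGrad's change to Adam is small: instead of normalizing by the current second-moment estimate, it normalizes by a running maximum of that estimate, so the effective step size can never grow between iterations. A minimal NumPy sketch of one update step follows; the function name, default hyperparameters, and the toy quadratic are illustrative, not taken from the paper.

```python
import numpy as np

def amsgrad_step(theta, grad, state, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """One AMSGrad update. The only difference from Adam is that v_hat keeps
    a running maximum of the second-moment estimate, which guarantees a
    non-increasing effective learning rate per coordinate."""
    m, v, v_hat = state
    m = beta1 * m + (1 - beta1) * grad        # first-moment estimate (as in Adam)
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment estimate (as in Adam)
    v_hat = np.maximum(v_hat, v)              # AMSGrad: non-decreasing v_hat
    theta = theta - lr * m / (np.sqrt(v_hat) + eps)
    return theta, (m, v, v_hat)

# Illustrative usage: minimize f(x) = x^2 starting from x = 1.0
theta = np.array(1.0)
state = (np.zeros_like(theta), np.zeros_like(theta), np.zeros_like(theta))
for _ in range(500):
    grad = 2 * theta
    theta, state = amsgrad_step(theta, grad, state)
```

Because `v_hat` is monotone, the counterexamples in which Adam's step size oscillates and prevents convergence no longer apply.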
References (9 articles)
1. Robbins, H. and Monro, S. A stochastic approximation method. The Annals of Mathematical Statistics, pp. 400–407, 1951.
2. Kingma, D. and Ba, J. Adam: A method for stochastic optimization. In International Conference on Learning Representations (ICLR), 2015.
3. Tieleman, T. and Hinton, G. Lecture 6.5-RMSProp: Divide the gradient by a running average of its recent magnitude. COURSERA: Neural Networks for Machine Learning, 4, 2012.
Cited by
12 articles.