A modified Adam algorithm for deep neural network optimization

Author:

Reyad Mohamed, Sarhan Amany M., Arafa M.

Abstract

Deep Neural Networks (DNNs) are widely regarded as the most effective learning tools for dealing with large datasets, and they have been used successfully in thousands of applications across a variety of fields. They are trained on these large datasets to learn the relationships between variables. The adaptive moment estimation (Adam) algorithm, a highly efficient adaptive optimization algorithm, is widely used to train DNN models in many fields. However, its generalization performance needs improvement, especially when training on large-scale datasets. In this paper, we therefore propose HN_Adam, a modified version of the Adam algorithm, to improve its accuracy and convergence speed. HN_Adam automatically adjusts the step size of the parameter updates over the training epochs. This automatic adjustment is based on the norm value of the parameter update formula, computed from the gradient values obtained during the training epochs. Furthermore, a hybrid mechanism is created by combining the standard Adam algorithm with the AMSGrad algorithm. As a result of these changes, HN_Adam achieves good generalization performance, like the stochastic gradient descent (SGD) algorithm, while also converging quickly, like other adaptive algorithms. To evaluate its performance, the proposed HN_Adam algorithm is used to train a deep convolutional neural network (CNN) model that classifies images from two standard datasets: MNIST and CIFAR-10. Its results are compared to those of the basic Adam algorithm and the SGD algorithm, as well as five other recent adaptive SGD algorithms. In most comparisons, HN_Adam outperforms the compared algorithms in terms of accuracy and convergence speed. AdaBelief is the most competitive of the compared algorithms. In terms of testing accuracy and convergence speed (measured by the consumed training time), HN_Adam outperforms AdaBelief by 1.0% and 0.29% for the MNIST dataset, and by 0.93% and 1.68% for the CIFAR-10 dataset, respectively.
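To make the abstract's description more concrete, the following is a minimal, hypothetical sketch of a single parameter update that combines standard Adam with AMSGrad-style second-moment tracking and applies a norm-based step-size adjustment. The function name `hn_adam_step` and the specific scaling heuristic are illustrative assumptions; the paper defines the actual HN_Adam update rule, which may differ.

```python
# Illustrative Adam/AMSGrad hybrid with an assumed norm-based step-size
# adjustment, loosely following the abstract's description of HN_Adam.
# This is NOT the authors' published code.

import numpy as np

def hn_adam_step(param, grad, m, v, v_max, t,
                 lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One hypothetical update step for a single parameter tensor."""
    # Standard Adam first and second moment estimates
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2

    # AMSGrad component: keep the running maximum of the second moment
    v_max = np.maximum(v_max, v)

    # Bias correction as in standard Adam
    m_hat = m / (1 - beta1 ** t)
    v_hat = v_max / (1 - beta2 ** t)

    # Raw update direction
    update = m_hat / (np.sqrt(v_hat) + eps)

    # Assumed heuristic: damp the step when the norm of the update grows,
    # standing in for the paper's automatic step-size adjustment over epochs.
    scale = 1.0 / (1.0 + np.linalg.norm(update) / (np.sqrt(update.size) + eps))

    param = param - lr * scale * update
    return param, m, v, v_max
```

In use, `m`, `v`, and `v_max` would be initialized to zeros with the same shape as the parameter and carried across steps, with `t` counting update steps from 1; the key design point suggested by the abstract is that the effective step size is not fixed but reacts to the norm of the update computed from the current gradients.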

Funder

Tanta University

Publisher

Springer Science and Business Media LLC

Subject

Artificial Intelligence, Software

