A convergence analysis of Nesterov’s accelerated gradient method in training deep linear neural networks

Published: 2022-10
Volume: 612
Pages: 898-925
ISSN: 0020-0255
Container-title: Information Sciences
Language: en

Author:

Liu Xin, Tao Wei, Pan Zhisong

Funder

National Natural Science Foundation of China

Publisher

Elsevier BV

Subject

Artificial Intelligence, Information Systems and Management, Computer Science Applications, Theoretical Computer Science, Control and Systems Engineering, Software

References: 48 articles.

1. Abadi, 2016. TensorFlow: A system for large-scale machine learning.

2. Arora, S., Cohen, N., Golowich, N., Hu, W., 2019a. A convergence analysis of gradient descent for deep linear neural networks, in: International Conference on Learning Representations. https://openreview.net/forum?id=SkMQg3C5K7.

3. Arora, 2019. Fine-grained analysis of optimization and generalization for overparameterized two-layer neural networks.

4. Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al., 2020. Language models are few-shot learners, in: Advances in Neural Information Processing Systems. https://proceedings.neurips.cc/paper/2020/hash/1457c0d6bfcb4967418bfb8ac142f64a-Abstract.html.

5. Bu, 2021. A dynamical view on optimization algorithms of overparameterized neural networks.

Cited by 2 articles.

1. Sensitivity Analysis of Gas Consumption in Gas Turbine Combined Cycle. 2023 3rd International Conference on Electrical Engineering and Control Science (IC2ECS), 2023-12-29.

2. Infrared imaging segmentation employing an explainable deep neural network. Turkish Journal of Electrical Engineering and Computer Sciences, 2023-10-07.