Diffusion Approximations for the Constant Learning Rate Backpropagation Algorithm and Resistance to Local Minima

Published: 1994-03, Volume: 6, Issue: 2, Pages: 285-295
ISSN: 0899-7667
Container-title: Neural Computation
Language: en

Author:

Finnoff William¹

Affiliation:

1. Siemens AG, Corporate Research and Development, Otto-Hahn-Ring 6, D-8000 Munich 83, Germany

Abstract

In this paper we discuss the asymptotic properties of the most commonly used variant of the backpropagation algorithm in which network weights are trained by means of a local gradient descent on examples drawn randomly from a fixed training set, and the learning rate η of the gradient updates is held constant (simple backpropagation). Using stochastic approximation results, we show that for η → 0 this training process approaches a batch training. Further, we show that for small η one can approximate simple backpropagation by the sum of a batch training process and a gaussian diffusion, which is the unique solution to a linear stochastic differential equation. Using this approximation we indicate the reasons why simple backpropagation is less likely to get stuck in local minima than the batch training process and demonstrate this empirically on a number of examples.
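The first claim of the abstract, that simple backpropagation approaches batch training as η → 0, can be illustrated numerically. The following is a minimal sketch and not the paper's own experiment: the four-point training set, the single-weight linear model, the time horizon, and all function names are assumptions chosen for illustration. Both processes use the same constant learning rate and the same number of steps, and the sketch records the largest gap between the two weight paths.

```python
import random

# Hypothetical toy training set (not from the paper): targets are roughly 2 * x
# with a fixed perturbation, so per-example gradients do not all vanish together.
XS = [0.5, 1.0, 1.5, 2.0]
YS = [1.2, 1.8, 3.3, 3.9]


def example_gradient(w, i):
    """Gradient of the per-example loss 0.5 * (w * x_i - y_i)**2 with respect to w."""
    return (w * XS[i] - YS[i]) * XS[i]


def batch_gradient(w):
    """Gradient of the mean loss over the whole training set."""
    return sum(example_gradient(w, i) for i in range(len(XS))) / len(XS)


def max_deviation(eta, horizon=5.0, w0=0.0, seed=0):
    """Largest gap between the simple-backpropagation path and the batch path.

    Both paths start at w0, use the same constant learning rate eta, and take
    round(horizon / eta) steps, so step n corresponds to time n * eta in both.
    """
    rng = random.Random(seed)
    w_simple = w_batch = w0
    worst = 0.0
    for _ in range(round(horizon / eta)):
        i = rng.randrange(len(XS))  # one example drawn at random from the fixed set
        w_simple -= eta * example_gradient(w_simple, i)
        w_batch -= eta * batch_gradient(w_batch)
        worst = max(worst, abs(w_simple - w_batch))
    return worst


def mean_max_deviation(eta, runs=20):
    """Average of max_deviation over several independent random seeds."""
    return sum(max_deviation(eta, seed=s) for s in range(runs)) / runs


if __name__ == "__main__":
    for eta in (0.1, 0.01, 0.001):
        print(f"eta={eta}: mean max deviation = {mean_max_deviation(eta):.4f}")
```

In this toy setting the averaged gap shrinks as η decreases, which is the qualitative behavior the abstract describes. The sketch does not attempt to reproduce the diffusion approximation itself or the local-minima experiments.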

Publisher

MIT Press - Journals

Subject

Cognitive Neuroscience, Arts and Humanities (miscellaneous)

Link

https://www.mitpressjournals.org/doi/pdf/10.1162/neco.1994.6.2.285

References (5 articles)

1. Improving model selection by nonconvergent methods

2. Convergence of learning algorithms with constant learning rates

3. Théorèmes de convergence presque sure pour une classe d'algorithmes stochastiques à pas décroissant

4. Some Asymptotic Results for Learning in Single Hidden-Layer Feedforward Network Models

5. Learning in Artificial Neural Networks: A Statistical Perspective

Cited by 37 articles.

1. Perceptron: Learning, Generalization, Model Selection, Fault Tolerance, and Role in the Deep Learning Era;Mathematics;2022-12-13

2. Convergence of Online Gradient Method with Momentum for BP Neural Network;Journal of Physics: Conference Series;2021-03-01

3. Convergence Analysis of Multilayer BP Neural Network with Momentum Term;Journal of Physics: Conference Series;2020-10-01

4. Simulating the response of ionization chamber system to 137Cs irradiator using the artificial neural network modeling algorithm;SN Applied Sciences;2020-07-03

5. Multilayer Perceptrons: Architecture and Error Backpropagation;Neural Networks and Statistical Learning;2019