Natural Gradient Works Efficiently in Learning

Published:1998-02-01 Issue:2 Volume:10 Page:251-276
ISSN:0899-7667
Container-title:Neural Computation
language:en
Short-container-title:Neural Computation

Author:

Amari Shun-ichi¹

Affiliation:

1. RIKEN Frontier Research Program, Saitama 351-01, Japan

Abstract

When a parameter space has a certain underlying structure, the ordinary gradient of a function does not represent its steepest direction, but the natural gradient does. Information geometry is used for calculating the natural gradients in the parameter space of perceptrons, the space of matrices (for blind source separation), and the space of linear dynamical systems (for blind source deconvolution). The dynamical behavior of natural gradient online learning is analyzed and is proved to be Fisher efficient, implying that it has asymptotically the same performance as the optimal batch estimation of parameters. This suggests that the plateau phenomenon, which appears in the backpropagation learning algorithm of multilayer perceptrons, might disappear or might not be so serious when the natural gradient is used. An adaptive method of updating the learning rate is proposed and analyzed.
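The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's own algorithm. It assumes a one-dimensional Gaussian model with parameters (mu, sigma), the Fisher information matrix as the metric on that parameter space, and a 1/t learning rate. The function names `natural_gradient_step` and `fit_online` are hypothetical. The natural gradient is the ordinary gradient of the negative log-likelihood premultiplied by the inverse Fisher matrix, which for this model is diag(sigma^2, sigma^2 / 2).

```python
def natural_gradient_step(mu, sigma, x, eta):
    """One online natural-gradient step for a Gaussian model (illustrative only)."""
    d = x - mu
    # Ordinary gradient of the per-sample loss l = log(sigma) + d^2 / (2 sigma^2).
    grad_mu = -d / sigma**2
    grad_sigma = 1.0 / sigma - d**2 / sigma**3
    # Premultiply by the inverse Fisher information, diag(sigma^2, sigma^2 / 2).
    nat_mu = sigma**2 * grad_mu
    nat_sigma = 0.5 * sigma**2 * grad_sigma
    return mu - eta * nat_mu, sigma - eta * nat_sigma


def fit_online(samples, mu=0.0, sigma=1.0):
    """Process samples one at a time with learning rate 1/t (an assumption)."""
    for t, x in enumerate(samples, start=1):
        mu, sigma = natural_gradient_step(mu, sigma, x, eta=1.0 / t)
    return mu, sigma
```

Under these assumptions the update for mu simplifies to mu + (x - mu) / t, which is the running sample mean, so the online estimate of the mean coincides with the batch estimate after every sample. This is only a toy instance of the efficiency property the abstract describes.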

Publisher

MIT Press - Journals

Subject

Cognitive Neuroscience, Arts and Humanities (miscellaneous)

Link

https://www.mitpressjournals.org/doi/pdf/10.1162/089976698300017746

References: 22 articles.

1. Neural theory of association and concept-formation

2. Differential geometry of a parametric family of invertible linear systems—Riemannian metric, dual affine connections, and divergence

3. A universal theorem on learning curves

4. Information geometry

Cited by 1533 articles.

1. Probabilistic resistance predictions of laterally restrained cellular steel beams by natural gradient boosting; Thin-Walled Structures; 2024-12

2. A Kaczmarz-inspired approach to accelerate the optimization of neural network wavefunctions; Journal of Computational Physics; 2024-11

3. Fermi Machine — Quantum Many-Body Solver Derived from Correspondence between Noninteracting and Strongly Correlated Fermions; Journal of the Physical Society of Japan; 2024-10-15

4. Passive underwater tracking with unknown measurement noise statistics using variational Bayesian approximation; Digital Signal Processing; 2024-10

5. A Novel Underwater Wireless Optical Communication Optical Receiver Decision Unit Strategy Based on a Convolutional Neural Network; Mathematics; 2024-09-10