Authors:
Yoshida Yuki, Okada Masato
Abstract
The plateau phenomenon, in which the loss value stops decreasing during learning, has been reported by various researchers. The phenomenon was actively investigated in the 1990s and found to stem from the fundamental hierarchical structure of neural network models; since then, it has been regarded as inevitable. However, the phenomenon seldom occurs in recent deep learning, leaving a gap between theory and practice. In this paper, using a statistical-mechanical formulation, we clarify the relationship between the plateau phenomenon and the statistical properties of the data being learned. We show that data whose covariance has small and dispersed eigenvalues tend to make the plateau phenomenon inconspicuous.
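The abstract's setting can be illustrated with a minimal sketch (not the authors' code; all function names and hyperparameters here are assumptions): a teacher-student setup with a small two-layer network trained by online SGD on Gaussian inputs whose covariance eigenvalues we control, so that loss curves for identity covariance versus small, dispersed eigenvalues can be compared.

```python
# Hypothetical illustration of the teacher-student setup discussed in the
# abstract. A two-layer "soft committee" student (fixed output weights of 1,
# tanh hidden units) learns a same-sized teacher by online SGD. Inputs are
# drawn from N(0, diag(eigvals)), so the covariance spectrum is controlled
# directly. The paper's claim is that small, dispersed eigenvalues tend to
# make loss plateaus inconspicuous; this sketch only sets up the comparison.
import numpy as np

rng = np.random.default_rng(0)

def soft_committee(x, W):
    # Two-layer network: hidden pre-activations x @ W.T, tanh nonlinearity,
    # output weights fixed to 1 (summed hidden units).
    return np.tanh(x @ W.T).sum(axis=-1)

def train_losses(eigvals, n_in=20, n_hidden=2, steps=20000, lr=0.05):
    """Online SGD on squared loss; inputs ~ N(0, diag(eigvals))."""
    eigvals = np.asarray(eigvals, dtype=float)
    teacher = rng.standard_normal((n_hidden, n_in)) / np.sqrt(n_in)
    student = rng.standard_normal((n_hidden, n_in)) * 0.01  # small init
    scales = np.sqrt(eigvals)
    losses = []
    for _ in range(steps):
        x = rng.standard_normal(n_in) * scales          # covariance diag(eigvals)
        y = soft_committee(x, teacher)
        h = x @ student.T
        err = np.tanh(h).sum() - y
        # Gradient of 0.5 * err**2 w.r.t. the student weights.
        grad = err * (1.0 - np.tanh(h) ** 2)[:, None] * x[None, :]
        student -= lr * grad
        losses.append(0.5 * err ** 2)
    return np.array(losses)

iso = train_losses(np.ones(20))               # identity covariance
disp = train_losses(np.logspace(-2, 0, 20))   # small, dispersed eigenvalues
```

Plotting a moving average of `iso` and `disp` over training steps would be the natural way to inspect whether the plateau is more pronounced under the isotropic spectrum; the specific network size and learning rate above are illustrative choices, not values from the paper.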
Subject
Statistics, Probability and Uncertainty; Statistics and Probability; Statistical and Nonlinear Physics
Cited by 5 articles.