Authors:
Matthew Thorpe, Yves van Gennip
Abstract
Neural networks have been very successful in many applications; however, we often lack a theoretical understanding of what the neural networks are actually learning. This problem emerges when trying to generalise to new data sets. The contribution of this paper is to show that, for the residual neural network model, the deep layer limit coincides with a parameter estimation problem for a nonlinear ordinary differential equation. In particular, whilst it is known that the residual neural network model is a discretisation of an ordinary differential equation, we show convergence in a variational sense. This implies that optimal parameters converge in the deep layer limit, which is a stronger statement than saying that, for a fixed parameter, the residual neural network model converges (the latter does not in general imply the former). Our variational analysis provides a discrete-to-continuum $$\Gamma$$-convergence result for the objective function of the residual neural network training step to a variational problem constrained by a system of ordinary differential equations; this rigorously connects the discrete setting to a continuum problem.
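To make the discretisation mentioned in the abstract concrete, the following display is an illustrative sketch rather than the paper's exact formulation; the notation $$x_k$$ for the hidden state, $$\theta_k$$ for the layer parameters, and $$N$$ for the depth is assumed here. A residual network with $$N$$ layers updates its state as
$$x_{k+1} = x_k + \frac{1}{N}\, f(x_k, \theta_k), \qquad k = 0, \dots, N-1,$$
which is the explicit Euler discretisation, with step size $$1/N$$, of the ordinary differential equation
$$\dot{x}(t) = f(x(t), \theta(t)), \qquad t \in [0, 1].$$
In the deep layer limit $$N \to \infty$$, the layerwise parameters $$\theta_k$$ are replaced by a parameter function $$\theta(t)$$, so that training the network becomes a parameter estimation problem constrained by this ODE.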
Funder
H2020 European Research Council
Publisher
Springer Science and Business Media LLC
Subject
Applied Mathematics, Computational Mathematics, Mathematics (miscellaneous), Theoretical Computer Science
Cited by
5 articles.