Affiliation:
1. Department of Mathematics, Uppsala University, S-751 06 Uppsala, Sweden
Abstract
In this paper, we prove that, in the deep limit, stochastic gradient descent on a ResNet-type deep neural network, in which each layer shares the same weight matrix, converges to stochastic gradient descent for a Neural ODE, and that the corresponding value/loss functions converge. In the context of minimization by stochastic gradient descent, our result gives a theoretical foundation for viewing Neural ODEs as the deep limit of ResNets. Our proof is based on certain decay estimates for associated Fokker–Planck equations.
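The deep limit in the abstract rests on a simple structural observation: a ResNet whose layers share one weight matrix, with residual updates scaled by 1/N, is exactly the forward Euler discretization of a Neural ODE, so its output should approach the ODE's time-1 solution as the depth N grows. The following sketch illustrates this (all choices here, including the tanh residual map `f` and the parameter values, are our own illustrative assumptions, not taken from the paper):

```python
import numpy as np

def f(x, W):
    # Residual map with a shared weight matrix W; tanh is an
    # illustrative choice of nonlinearity, not the paper's.
    return np.tanh(W @ x)

def resnet_forward(x0, W, N):
    # Shared-weight ResNet of depth N:
    #   x_{k+1} = x_k + (1/N) f(x_k, W),
    # i.e. forward Euler for dx/dt = f(x, W) with step 1/N.
    x = x0.copy()
    for _ in range(N):
        x = x + f(x, W) / N
    return x

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4)) / 4
x0 = rng.standard_normal(4)

# Reference: a very deep network approximates the Neural ODE solution
# at time 1. Shallower shared-weight ResNets converge to it as N grows.
ref = resnet_forward(x0, W, 100_000)
for N in (10, 100, 1000):
    err = np.linalg.norm(resnet_forward(x0, W, N) - ref)
    print(f"N = {N:5d}, error vs. deep limit = {err:.2e}")
```

The printed errors shrink roughly like 1/N, the usual first-order rate of forward Euler, which is the elementary forward-pass analogue of the convergence the paper establishes for the full stochastic gradient descent dynamics.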
Publisher
World Scientific Pub Co Pte Lt
Subject
Applied Mathematics, Analysis
Cited by
12 articles.