Full error analysis for the training of deep neural networks

Published: 2022-04-05 Issue: 02 Volume: 25
ISSN:0219-0257
Container-title:Infinite Dimensional Analysis, Quantum Probability and Related Topics
language:en
Short-container-title:Infin. Dimens. Anal. Quantum. Probab. Relat. Top.

Author:

Beck Christian¹²,Jentzen Arnulf¹²³,Kuckuck Benno²⁴

Affiliation:

1. Seminar for Applied Mathematics, Department of Mathematics, ETH Zürich, Zürich, Switzerland

2. Applied Mathematics: Institute for Analysis and Numerics, Faculty of Mathematics and Computer Science, University of Münster, Münster, Germany

3. School of Data Science and Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Shenzhen, China

4. Institute of Mathematics, University of Düsseldorf, Düsseldorf, Germany

Abstract

Deep learning algorithms have been applied very successfully in recent years to a range of problems out of reach for classical solution paradigms. Nevertheless, there is no completely rigorous mathematical error and convergence analysis which explains the success of deep learning algorithms. The error of a deep learning algorithm can in many situations be decomposed into three parts, the approximation error, the generalization error, and the optimization error. In this work we estimate for a certain deep learning algorithm each of these three errors and combine these three error estimates to obtain an overall error analysis for the deep learning algorithm under consideration. In particular, we thereby establish convergence with a suitable convergence speed for the overall error of the deep learning algorithm under consideration. Our convergence speed analysis is far from optimal and the convergence speed that we establish is rather slow, increases exponentially in the dimensions, and, in particular, suffers from the curse of dimensionality. The main contribution of this work is, instead, to provide a full error analysis (i) which covers each of the three different sources of errors usually emerging in deep learning algorithms and (ii) which merges these three sources of errors into one overall error estimate for the considered deep learning algorithm.
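The three-part decomposition named in the abstract can be illustrated with a generic textbook-style bound. This is only a sketch under assumed notation, not the paper's own statement: the risk $\mathcal{R}$, the empirical risk $\widehat{\mathcal{R}}_n$ over $n$ samples, the network class $\mathcal{H}$, the target $f^*$, and the trained network $\hat f \in \mathcal{H}$ are all symbols introduced here for illustration.

```latex
% Assumed notation (not from the source record):
%   \mathcal{R}(f)            population risk of f
%   \widehat{\mathcal{R}}_n(f) empirical risk of f on n training samples
%   \mathcal{H}               class of networks with a fixed architecture
%   f^*                       target function, \hat f \in \mathcal{H} the training output
\begin{align*}
\mathcal{R}(\hat f) - \mathcal{R}(f^*)
  &= \bigl[\mathcal{R}(\hat f) - \widehat{\mathcal{R}}_n(\hat f)\bigr]
   + \bigl[\widehat{\mathcal{R}}_n(\hat f) - \widehat{\mathcal{R}}_n(f)\bigr]
   + \bigl[\widehat{\mathcal{R}}_n(f) - \mathcal{R}(f)\bigr]
   + \bigl[\mathcal{R}(f) - \mathcal{R}(f^*)\bigr]
   \quad \text{for every } f \in \mathcal{H}, \\
\mathcal{R}(\hat f) - \mathcal{R}(f^*)
  &\le \underbrace{\inf_{f \in \mathcal{H}} \mathcal{R}(f) - \mathcal{R}(f^*)}_{\text{approximation error}}
   + \underbrace{2 \sup_{g \in \mathcal{H}} \bigl|\mathcal{R}(g) - \widehat{\mathcal{R}}_n(g)\bigr|}_{\text{generalization error}}
   + \underbrace{\widehat{\mathcal{R}}_n(\hat f) - \inf_{g \in \mathcal{H}} \widehat{\mathcal{R}}_n(g)}_{\text{optimization error}}.
\end{align*}
```

The first line is a telescoping identity. The second line follows by bounding the first and third brackets by the supremum, bounding the second bracket using $\widehat{\mathcal{R}}_n(f) \ge \inf_{g \in \mathcal{H}} \widehat{\mathcal{R}}_n(g)$, and then taking the infimum over $f \in \mathcal{H}$ in the remaining term. The precise decomposition and constants used in the paper may differ.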

Funder

Germany's Excellence Strategy

Publisher

World Scientific Pub Co Pte Ltd

Subject

Applied Mathematics,Mathematical Physics,Statistics and Probability,Statistical and Nonlinear Physics

Link

https://www.worldscientific.com/doi/pdf/10.1142/S021902572150020X

Cited by 12 articles.

1. Deep learning based on randomized quasi-Monte Carlo method for solving linear Kolmogorov partial differential equation;Journal of Computational and Applied Mathematics;2024-12

2. Segmentation of Breast Cancer Masses in Mammography Images Using Deep Convolutional Neural Network (DCNN);2024-08-08

3. Learning smooth functions in high dimensions;Handbook of Numerical Analysis;2024

4. Error analysis for deep neural network approximations of parametric hyperbolic conservation laws;Mathematics of Computation;2023-12-15

5. Overall error analysis for the training of deep neural networks via stochastic gradient descent with random initialisation;Applied Mathematics and Computation;2023-10