Infinite‐width limit of deep linear neural networks

Author:

Chizat Lénaïc (1), Colombo Maria (1), Fernández‐Real Xavier (1), Figalli Alessio (2)

Affiliation:

1. EPFL SB MATH, Institute of Mathematics, Lausanne, Switzerland

2. Department of Mathematics, ETH Zurich, Zurich, Switzerland

Abstract

This paper studies the infinite‐width limit of deep linear neural networks (NNs) initialized with random parameters. We show that, as the number of parameters diverges, the training dynamics converge (in a precise sense) to the dynamics obtained from gradient descent on an infinitely wide deterministic linear NN. Moreover, even though the weights remain random, we obtain their precise law along the training dynamics and prove a quantitative convergence result for the linear predictor in terms of the number of parameters. Finally, we study the continuous‐time limit obtained for infinitely wide linear NNs and show that the linear predictors of the NN converge at an exponential rate to the minimal ℓ₂‐norm minimizer of the risk.
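The setting described in the abstract can be illustrated numerically. The following is a minimal sketch, not the paper's construction: it runs full-batch gradient descent on a finite-width deep linear network for an underdetermined least-squares problem and compares the end-to-end linear predictor with the minimal ℓ₂-norm minimizer of the risk. The function name train_deep_linear, the 1/fan_in Gaussian initialization, the width, depth, step size, and the synthetic data are all assumptions chosen for illustration.

```python
import numpy as np


def train_deep_linear(X, y, width=64, depth=3, lr=2e-3, steps=5000, seed=0):
    """Full-batch gradient descent on a bias-free deep linear network.

    The predictor is x -> W_{depth-1} ... W_0 x with scalar output.
    Returns (initial risk, final risk, final end-to-end predictor).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    dims = [d] + [width] * (depth - 1) + [1]
    # Assumed initialization: i.i.d. Gaussian entries with variance 1 / fan_in.
    Ws = [rng.normal(0.0, 1.0 / np.sqrt(dims[k]), size=(dims[k + 1], dims[k]))
          for k in range(depth)]

    def end_to_end(weights):
        beta = weights[0]
        for W in weights[1:]:
            beta = W @ beta
        return beta  # shape (1, d)

    def risk(beta):
        residual = X @ beta.T - y[:, None]
        return 0.5 * float(np.mean(residual ** 2))

    initial_risk = risk(end_to_end(Ws))
    for _ in range(steps):
        # prefix[k] = W_{k-1} ... W_0 (identity for k = 0), shape (dims[k], d)
        prefix = [np.eye(d)]
        for W in Ws[:-1]:
            prefix.append(W @ prefix[-1])
        beta = Ws[-1] @ prefix[-1]
        # Gradient of the risk with respect to the end-to-end predictor.
        g = (X @ beta.T - y[:, None]).T @ X / n  # shape (1, d)
        # suffix = W_{depth-1} ... W_{k+1}, built from the output layer down.
        suffix = np.eye(1)
        grads = [None] * depth
        for k in reversed(range(depth)):
            grads[k] = suffix.T @ g @ prefix[k].T  # chain rule for layer k
            suffix = suffix @ Ws[k]
        Ws = [W - lr * G for W, G in zip(Ws, grads)]

    beta = end_to_end(Ws)
    return initial_risk, risk(beta), beta


# Synthetic underdetermined problem (n < d), with rows of X orthogonalized
# so that X X^T / n is the identity (an assumption made for good conditioning).
rng = np.random.default_rng(1)
n, d = 3, 5
Q, _ = np.linalg.qr(rng.normal(size=(d, n)))  # Q has orthonormal columns
X = np.sqrt(n) * Q.T
y = rng.normal(size=n)

beta_min_norm = np.linalg.pinv(X) @ y  # minimal l2-norm interpolating predictor
initial_risk, final_risk, beta = train_deep_linear(X, y)
gap = float(np.linalg.norm(beta.ravel() - beta_min_norm))

print("initial risk:", initial_risk)
print("final risk:", final_risk)
print("distance to minimal-norm predictor:", gap)
```

At finite width the printed distance need not be small, since part of the randomly initialized predictor lies outside the span of the data. The convergence statement in the abstract concerns the infinite-width, continuous-time limit, so this simulation should be read only as a rough illustration of the objects involved.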

Funder

Stavros Niarchos Foundation

Agencia Estatal de Investigación

European Research Council

Publisher

Wiley

