Abstract
Mean field theory has been successfully used to analyze deep neural networks (DNNs) in the infinite-size limit. Given the finite size of realistic DNNs, we utilize large deviation theory and path integral analysis to study the deviation of functions represented by DNNs from their typical mean field solutions. The parameter perturbations investigated include weight sparsification (dilution) and binarization, which are commonly used in model simplification, for both ReLU and sign activation functions. We find that random networks with ReLU activation are more robust to parameter perturbations than their counterparts with sign activation, which is arguably reflected in the simplicity of the functions they generate.
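A minimal numerical sketch of the setup described above (not the paper's path integral or large deviation calculation): propagate inputs through a deep random network, apply weight dilution or binarization, and measure how far the represented function moves from the unperturbed one. The layer width, depth, dilution rate, and deviation measure below are illustrative assumptions, not values taken from the article.

```python
# Sketch: output deviation of a random deep network under weight
# sparsification (dilution) and binarization, for ReLU vs sign activations.
import numpy as np

rng = np.random.default_rng(0)

def forward(x, weights, act):
    """Propagate inputs through a deep random network with activation `act`."""
    h = x
    for W in weights:
        pre = h @ W / np.sqrt(W.shape[0])  # 1/sqrt(N) scaling keeps pre-activations O(1)
        h = np.maximum(pre, 0.0) if act == "relu" else np.sign(pre)
    return h

def dilute(weights, p, rng):
    """Sparsify: keep each weight independently with probability 1 - p."""
    return [W * (rng.random(W.shape) > p) for W in weights]

def binarize(weights):
    """Binarize: replace each weight by its sign, discarding magnitudes."""
    return [np.sign(W) for W in weights]

N, depth, n_inputs = 500, 5, 200           # illustrative sizes
x = rng.standard_normal((n_inputs, N))
weights = [rng.standard_normal((N, N)) for _ in range(depth)]

for act in ("relu", "sign"):
    y0 = forward(x, weights, act)
    for name, pert in [("diluted 30%", dilute(weights, 0.3, rng)),
                       ("binarized", binarize(weights))]:
        y = forward(x, pert, act)
        # relative deviation of the represented function from the unperturbed one
        dev = np.linalg.norm(y - y0) / np.linalg.norm(y0)
        print(f"{act:5s} | {name:12s} | relative output deviation = {dev:.3f}")
```

In this toy comparison the relative deviation plays the role of a distance between the perturbed and unperturbed functions; the article's analysis instead characterizes the full distribution of such deviations for finite-size networks.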
Funder
Leverhulme Trust
Engineering and Physical Sciences Research Council
H2020 Marie Skłodowska-Curie Actions
Subject
General Physics and Astronomy, Mathematical Physics, Modelling and Simulation, Statistics and Probability, Statistical and Nonlinear Physics
Cited by: 7 articles.