Author:
Jing Junling, Jinhang Cai, Zhang Huisheng, Zhang Wenxia
Abstract
Dropout is perhaps the most popular regularization method for deep learning. Due to the stochastic nature of the Dropout mechanism, the convergence analysis of Dropout learning is challenging, and the existing convergence results are mainly probabilistic in nature. In this paper, we investigate the deterministic convergence of the mini-batch gradient learning method with Dropconnect and penalty. By drawing a set of samples of the mask matrix of Dropconnect regularization and presenting them to the learning process in a cyclic manner, we establish an upper bound on the norm of the weight vector sequence and prove that the gradient of the cost function, the cost function itself, and the weight vector sequence deterministically converge to zero, a constant, and a fixed point, respectively. Considering that Dropout is mathematically a specific realization of Dropconnect, the theoretical results established in this paper are also valid for Dropout learning. Illustrative simulations on the MNIST dataset are provided to verify the theoretical analysis.
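The abstract describes mini-batch gradient training in which a fixed set of Dropconnect mask matrices is drawn once and then presented cyclically, with an additional weight penalty. The following is a minimal sketch of that idea for a one-hidden-layer network; it is not the authors' exact algorithm, and all names, hyperparameters, and the L2 form of the penalty are illustrative assumptions.

```python
import numpy as np

# Sketch (assumed setup): one-hidden-layer network trained by mini-batch
# gradient descent with DropConnect masks drawn once and reused cyclically,
# plus an L2 weight penalty. Hyperparameters are illustrative.
rng = np.random.default_rng(0)

n_in, n_hidden, n_out = 20, 10, 1
W1 = rng.standard_normal((n_hidden, n_in)) * 0.1
W2 = rng.standard_normal((n_out, n_hidden)) * 0.1

keep_prob = 0.8      # probability that an individual weight is kept
penalty = 1e-4       # L2 penalty coefficient (assumed penalty form)
lr = 0.05            # learning rate
n_masks = 8          # size of the fixed mask-sample set, cycled through

# Draw the mask matrices once; they are presented to training in a cycle.
masks = [(rng.random(W1.shape) < keep_prob).astype(W1.dtype)
         for _ in range(n_masks)]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy regression data and mini-batches (stand-in for a real dataset).
X = rng.standard_normal((200, n_in))
y = np.tanh(X @ rng.standard_normal((n_in, n_out)))
batch_size = 20

for epoch in range(50):
    for b, start in enumerate(range(0, len(X), batch_size)):
        xb = X[start:start + batch_size]
        yb = y[start:start + batch_size]

        M = masks[b % n_masks]            # cyclic selection of the mask sample
        W1_masked = M * W1                # DropConnect masks weights, not units

        h = sigmoid(xb @ W1_masked.T)     # hidden activations, shape (B, n_hidden)
        out = h @ W2.T                    # network output, shape (B, n_out)
        err = out - yb

        # Gradients of 0.5*MSE + 0.5*penalty*||W||^2; the mask enters W1's
        # gradient through the chain rule.
        gW2 = err.T @ h / len(xb) + penalty * W2
        gz = (err @ W2) * h * (1.0 - h)
        gW1 = M * (gz.T @ xb) / len(xb) + penalty * W1

        W1 -= lr * gW1
        W2 -= lr * gW2
```

Reusing the same finite pool of masks in a fixed cycle is what removes the randomness from the iteration and makes a deterministic convergence analysis, as described in the abstract, possible; the Dropout case corresponds to masks whose rows are constant, zeroing entire units.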
Funder
National Natural Science Foundation of China
Publisher
Springer Science and Business Media LLC