A Kernel-Expanded Stochastic Neural Network

Published:2022-03-17 Issue:2 Volume:84 Page:547-578
ISSN:1369-7412
Container-title:Journal of the Royal Statistical Society Series B: Statistical Methodology
language:en
Author:

Sun Yan¹², Liang Faming¹²

Affiliation:

1. Department of Statistics, West Lafayette, IN, USA

2. Purdue University, West Lafayette, IN, USA

Abstract

The deep neural network suffers from many fundamental issues in machine learning. For example, it often gets trapped in a local minimum during training, and its prediction uncertainty is hard to assess. To address these issues, we propose the kernel-expanded stochastic neural network (K-StoNet) model, which incorporates support vector regression as the first hidden layer and reformulates the neural network as a latent variable model. The former maps the input vector into an infinite-dimensional feature space via a radial basis function kernel, ensuring the absence of local minima on its training loss surface. The latter breaks the high-dimensional non-convex neural network training problem into a series of low-dimensional convex optimization problems and allows its prediction uncertainty to be easily assessed. K-StoNet can be easily trained using the imputation-regularized optimization algorithm. Compared to traditional deep neural networks, K-StoNet possesses a theoretical guarantee of asymptotic convergence to the global optimum and allows the prediction uncertainty to be easily assessed. The performance of the new model in training, prediction and uncertainty quantification is illustrated with simulated and real data examples.
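As a small illustration of the radial basis function kernel mentioned in the abstract, the sketch below evaluates the standard form k(x, y) = exp(-gamma * ||x - y||^2) on a few points. It is not the K-StoNet model or the authors' code. The function names, the sample points and the value of `gamma` are assumptions chosen for illustration only.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Standard RBF kernel: exp(-gamma * squared Euclidean distance).

    gamma is a hypothetical bandwidth setting, not a value from the paper.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def gram_matrix(points, gamma=1.0):
    """Pairwise kernel values for a list of input vectors."""
    return [[rbf_kernel(p, q, gamma) for q in points] for p in points]

# Three made-up 2-D inputs, used only to show the shape of the computation.
points = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
K = gram_matrix(points, gamma=0.5)

for row in K:
    print(["%.4f" % v for v in row])
```

The kernel value depends only on the distance between two inputs, so the matrix is symmetric with ones on the diagonal. A kernel method such as support vector regression works with these pairwise values and never builds the infinite-dimensional feature vectors explicitly.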

Publisher

Oxford University Press (OUP)

Subject

Statistics, Probability and Uncertainty; Statistics and Probability

Link

https://onlinelibrary.wiley.com/doi/pdf/10.1111/rssb.12496

Reference: 67 articles.

1. A convergence theory for deep learning via over-parameterization;Allen-Zhu,2019

2. Lagrangian support vector regression via unconstrained convex minimization;Balasundaram;Neural Networks,2013

3. A representer theorem for deep kernel learning;Bohn;Journal of Machine Learning Research,2019

4. The SEM algorithm: a probabilistic teacher algorithm derived from the EM algorithm for the mixture problem;Celeux;Computational Statistics Quarterly,1985

5. Underdamped Langevin MCMC: a non-asymptotic analysis;Cheng,2018

Cited by 1 article.

1. Density regression and uncertainty quantification with Bayesian deep noise neural networks;Stat;2023-01