Information-Corrected Estimation: A Generalization Error Reducing Parameter Estimation Method

Published:2021-10-28 Issue:11 Volume:23 Page:1419
ISSN:1099-4300
Container-title:Entropy
language:en
Short-container-title:Entropy

Author:

Dixon Matthew, Ward Tyler

Abstract

Modern computational models in supervised machine learning are often highly parameterized universal approximators. As such, the value of the parameters is unimportant, and only the out-of-sample performance is considered. On the other hand, much of the literature on model estimation assumes that the parameters themselves have intrinsic value, and thus is concerned with the bias and variance of parameter estimates, which may not have any simple relationship to out-of-sample model performance. Therefore, within supervised machine learning, heavy use is made of ridge regression (i.e., L2 regularization), which requires the estimation of hyperparameters and can be rendered ineffective by certain model parameterizations. We introduce an objective function, which we refer to as Information-Corrected Estimation (ICE), that reduces KL-divergence-based generalization error for supervised machine learning. ICE attempts to directly maximize a corrected likelihood function as an estimator of the KL divergence. Such an approach is proven, theoretically, to be effective for a wide class of models, with only mild regularity restrictions. Under finite sample sizes, this corrected estimation procedure is shown experimentally to lead to a significant reduction in generalization error compared to maximum likelihood estimation and L2 regularization.
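The abstract does not state the ICE objective itself, so the following is only a minimal sketch of the problem it targets, not the authors' method. Assuming a toy Gaussian model with unknown mean and variance (an illustrative choice, as are the function name `optimism_gap` and all of its parameters), it estimates how much the in-sample average negative log-likelihood at the maximum likelihood fit understates the true out-of-sample value. Since the expected out-of-sample negative log-likelihood equals the KL divergence from the true distribution plus a constant, this gap is the bias that a corrected likelihood would need to offset.

```python
import numpy as np


def optimism_gap(n=20, trials=20000, seed=0, true_mu=0.0, true_var=1.0):
    """Monte Carlo estimate of E[out-of-sample NLL - in-sample NLL] at the MLE.

    Hypothetical illustration: the model is N(mu, var) fitted by maximum
    likelihood to n draws from N(true_mu, true_var).
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(true_mu, np.sqrt(true_var), size=(trials, n))

    # Maximum likelihood estimates for each simulated data set.
    mu_hat = x.mean(axis=1)
    var_hat = x.var(axis=1)  # ddof=0, i.e. the MLE of the variance

    log_term = 0.5 * np.log(2.0 * np.pi * var_hat)

    # In-sample average negative log-likelihood of the fitted model.
    sq_in = ((x - mu_hat[:, None]) ** 2).mean(axis=1)
    nll_in = log_term + 0.5 * sq_in / var_hat

    # Exact expected negative log-likelihood of the fitted model on fresh
    # data from the true distribution (cross-entropy, i.e. KL + constant).
    sq_out = true_var + (true_mu - mu_hat) ** 2
    nll_out = log_term + 0.5 * sq_out / var_hat

    return float(np.mean(nll_out - nll_in))


if __name__ == "__main__":
    print(optimism_gap(n=20))
    print(optimism_gap(n=200))
```

Under these assumptions the gap is positive and shrinks as `n` grows, which is consistent with the abstract's emphasis on finite sample sizes.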

Publisher

MDPI AG

Subject

General Physics and Astronomy

Link

https://www.mdpi.com/1099-4300/23/11/1419/pdf

Reference: 25 articles.

1. On Information and Sufficiency

2. Berk-Nash Equilibrium: A Framework for Modeling Agents With Misspecified Models

3. Approximate Bayesian Computation with Kullback-Leibler Divergence as Data Discrepancy;Jiang,2018

4. Estimating Divergence Functionals and the Likelihood Ratio by Convex Risk Minimization

Cited by 3 articles.

1. Improved information criteria for Bayesian model averaging in lattice field theory;Physical Review D;2024-01-29

2. Improving the Performance and Stability of TIC and ICE;Entropy;2023-03-16

3. Reliability-Based Design Optimization of Structures Considering Uncertainties of Earthquakes Based on Efficient Gaussian Process Regression Metamodeling;Axioms;2022-02-20