Authors:
Jose Blanchet, Yang Kang, Karthyek Murthy
Abstract
We show that several machine learning estimators, including the square-root least absolute shrinkage and selection operator (square-root lasso) and regularized logistic regression, can be represented as solutions to distributionally robust optimization problems. The associated uncertainty regions are based on suitably defined Wasserstein distances. Hence, our representations allow us to view regularization as the result of introducing an artificial adversary that perturbs the empirical distribution to account for out-of-sample effects in loss estimation. In addition, we introduce RWPI (robust Wasserstein profile inference), a novel inference methodology which extends the use of methods inspired by empirical likelihood to the setting of optimal transport costs (of which Wasserstein distances are a particular case). We use RWPI to show how to optimally select the size of uncertainty regions and, as a consequence, to choose regularization parameters for these machine learning estimators without the use of cross-validation. Numerical experiments are also given to validate our theoretical findings.
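For concreteness, a minimal sketch (not the authors' code) of the square-root lasso whose distributionally robust representation the abstract describes: it fits the estimator on synthetic data, assuming numpy and cvxpy are available. The penalty level lam is a placeholder for the RWPI-calibrated level, which in the paper's representation corresponds to the square root of the radius of the Wasserstein uncertainty ball around the empirical distribution.

```python
# Square-root lasso on synthetic data: minimize the root-mean-squared
# residual plus an l1 penalty. Per the DRO representation discussed in
# the abstract, the penalty level stands in for the (square root of the)
# radius of a Wasserstein ball around the empirical distribution.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, d = 100, 10
X = rng.standard_normal((n, d))
beta_true = np.zeros(d)
beta_true[:3] = [1.0, -2.0, 0.5]          # sparse ground truth
y = X @ beta_true + 0.1 * rng.standard_normal(n)

lam = 0.1                                  # placeholder; RWPI chooses this level in the paper
beta = cp.Variable(d)
rms_loss = cp.norm(y - X @ beta, 2) / np.sqrt(n)
problem = cp.Problem(cp.Minimize(rms_loss + lam * cp.norm(beta, 1)))
problem.solve()
print(np.round(beta.value, 3))             # recovers the sparse signal approximately
```

The point of the paper is that solving this penalized problem is equivalent to solving a min-max problem against an adversary constrained to a Wasserstein ball, so calibrating lam amounts to choosing the ball's radius, which RWPI does via the profile function's limiting distribution rather than cross-validation.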
Publisher
Cambridge University Press (CUP)
Subject
Statistics, Probability and Uncertainty; General Mathematics; Statistics and Probability
Cited by
93 articles.