Affiliation:
1. Department of Computer Sciences, University of Wisconsin-Madison, Madison, Wisconsin, USA
2. Department of Pure Mathematics and Mathematical Statistics, University of Cambridge, Cambridge, UK
Abstract
Estimating the mean of a probability distribution using i.i.d. samples is a classical problem in statistics, wherein finite-sample optimal estimators are sought under various distributional assumptions. In this paper, we consider the problem of mean estimation when independent samples are drawn from $d$-dimensional non-identical distributions possessing a common mean. When the distributions are radially symmetric and unimodal, we propose a novel estimator, which is a hybrid of the modal interval, shorth and median estimators and whose performance adapts to the level of heterogeneity in the data. We show that our estimator is near optimal when data are i.i.d. and when the fraction of ‘low-noise’ distributions is as small as $\varOmega \left (\frac{d \log n}{n}\right )$, where $n$ is the number of samples. We also derive minimax lower bounds on the expected error of any estimator that is agnostic to the scales of individual data points. Finally, we extend our theory to linear regression. In both the mean estimation and regression settings, we present computationally feasible versions of our estimators that run in time polynomial in the number of data points.
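As a point of reference for the three classical ingredients named in the abstract, the following is a minimal one-dimensional sketch of the median, shorth and modal interval estimators. It is not the paper's hybrid estimator, and it does not cover the $d$-dimensional or regression settings. The function names, the default choice of `k`, and the half-width parameter `r` are assumptions made for illustration only.

```python
import statistics


def median_estimate(xs):
    """Sample median of a one-dimensional data set."""
    return statistics.median(xs)


def shorth_estimate(xs, k=None):
    """Midpoint of the shortest interval containing k data points.

    Assumption: k defaults to floor(n/2) + 1 (a "shortest half" variant).
    """
    s = sorted(xs)
    n = len(s)
    if k is None:
        k = n // 2 + 1
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(xs)")
    # Slide a window of k consecutive order statistics and keep the narrowest.
    best = min(range(n - k + 1), key=lambda i: s[i + k - 1] - s[i])
    return 0.5 * (s[best] + s[best + k - 1])


def modal_interval_estimate(xs, r):
    """Centre of a window of width at most 2r that covers the most points.

    Assumption: r >= 0 is a user-chosen half-width. Ties are broken in
    favour of the leftmost window.
    """
    if r < 0:
        raise ValueError("r must be non-negative")
    s = sorted(xs)
    n = len(s)
    best_count, best_lo, best_hi = 0, 0, 0
    j = 0
    for i in range(n):
        # Advance j to the last index whose point lies within 2r of s[i].
        while j + 1 < n and s[j + 1] - s[i] <= 2 * r:
            j += 1
        count = j - i + 1
        if count > best_count:
            best_count, best_lo, best_hi = count, i, j
    # Return the midpoint of the extreme points covered by the best window.
    return 0.5 * (s[best_lo] + s[best_hi])


# Hypothetical data: four low-noise points near 0 and three high-noise points.
data = [0.0, 0.1, -0.1, 0.05, 50.0, -40.0, 30.0]
print(median_estimate(data))
print(shorth_estimate(data))
print(modal_interval_estimate(data, r=0.2))
```

In this toy data set the shorth and modal interval estimates are driven by the tight cluster of low-noise points, while the median depends only on the ordering of the sample. This is the kind of complementary behaviour across heterogeneity levels that motivates combining the three.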
Funder
National Science Foundation
Publisher
Oxford University Press (OUP)
Subject
Applied Mathematics, Computational Theory and Mathematics, Numerical Analysis, Statistics and Probability, Analysis
Cited by
3 articles.
1. Near-Optimal Mean Estimation with Unknown, Heteroskedastic Variances;Proceedings of the 56th Annual ACM Symposium on Theory of Computing;2024-06-10
2. Robust empirical risk minimization via Newton’s method;Econometrics and Statistics;2023-07
3. On mean estimation for heteroscedastic random variables;Annales de l'Institut Henri Poincaré, Probabilités et Statistiques;2023-02-01