Abstract
Improvements in technology lead to an increasing availability of large data sets, which makes the need for data reduction and informative subsamples ever more important. In this paper, we construct D-optimal subsampling designs for polynomial regression in one covariate for invariant distributions of the covariate. We study quadratic regression more closely for specific distributions. In particular, we make statements on the shape of the resulting optimal subsampling designs and the effect of the subsample size on the design. To illustrate the advantage of the optimal subsampling designs, we examine the efficiency of uniform random subsampling.
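The notion of D-efficiency used when comparing subsampling schemes can be illustrated numerically. The following is a minimal sketch under stated assumptions, not the construction from the paper: the covariate is assumed to be standard normal, the model is quadratic regression, and the comparison subsample is a simple heuristic (hypothetical, chosen only for illustration) that takes one third of its points closest to zero and two thirds from the extremes. The function names `information_matrix` and `d_efficiency` are illustrative.

```python
import numpy as np

def information_matrix(x, degree=2):
    """Normalized information matrix (1/n) F^T F for polynomial regression."""
    # Columns of F are 1, x, x^2, ..., x^degree.
    F = np.vander(x, degree + 1, increasing=True)
    return F.T @ F / len(x)

def d_efficiency(x_sub, x_ref, degree=2):
    """D-efficiency of x_sub relative to x_ref: (det M_sub / det M_ref)^(1/p)."""
    p = degree + 1  # number of regression parameters
    _, logdet_sub = np.linalg.slogdet(information_matrix(x_sub, degree))
    _, logdet_ref = np.linalg.slogdet(information_matrix(x_ref, degree))
    return float(np.exp((logdet_sub - logdet_ref) / p))

rng = np.random.default_rng(0)
x_full = rng.standard_normal(100_000)  # assumed covariate distribution
k = 1_000                              # subsample size

# Uniform random subsample of size k.
uniform_sub = rng.choice(x_full, size=k, replace=False)

# Heuristic comparison subsample (an assumption, not the optimal design):
# one third of the points nearest zero, the rest with the largest |x|.
order = np.argsort(np.abs(x_full))
n_inner = k // 3
heuristic_sub = np.concatenate([x_full[order[:n_inner]],
                                x_full[order[-(k - n_inner):]]])

eff = d_efficiency(uniform_sub, heuristic_sub)
print(f"D-efficiency of uniform subsampling vs. heuristic: {eff:.3f}")
```

A value below 1 indicates that the uniform random subsample carries less information about the regression parameters, in the D-criterion sense, than the comparison subsample of the same size.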
Funder
Deutsche Forschungsgemeinschaft
Publisher
Springer Science and Business Media LLC
Subject
Statistics, Probability and Uncertainty; Statistics and Probability