Scalable subsampling: computation, aggregation and inference

Published: 2023-03-21
ISSN: 0006-3444
Container-title: Biometrika
Language: en

Author:

Politis Dimitris N¹

Affiliation:

1. Department of Mathematics, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093-0112, U.S.A.

Abstract

Subsampling has seen a resurgence in the big data era, where the standard, full-resample-size bootstrap can be infeasible to compute. Nevertheless, even choosing a single random subsample of size b can be computationally challenging when both b and the sample size n are very large. This paper shows how a set of appropriately chosen, nonrandom subsamples can be used to conduct effective and computationally feasible subsampling distribution estimation. Furthermore, the same set of subsamples can be used to yield a procedure for subsampling aggregation, also known as subagging, that is scalable with big data. Interestingly, the scalable subagging estimator can be tuned to have a rate of convergence that is the same as, or better than, that of the original estimator θ̂_n. Statistical inference could then be based on the scalable subagging estimator instead of the original θ̂_n.
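As a rough illustration of the idea of aggregating over nonrandom subsamples, the sketch below averages a statistic over deterministic, equally spaced blocks of the data. The block design (contiguous blocks of length b whose start points are h apart) is an assumption made here for illustration; the abstract does not say how the subsamples are chosen. The function names `block_starts` and `subagging_estimate` are hypothetical and do not come from the paper.

```python
import numpy as np


def block_starts(n, b, h):
    """Start indices of contiguous blocks of length b, spaced h apart.

    Assumed design for illustration only: block j covers indices
    [j*h, j*h + b), which gives floor((n - b) / h) + 1 blocks.
    """
    if not (1 <= b <= n) or h < 1:
        raise ValueError("need 1 <= b <= n and h >= 1")
    return list(range(0, n - b + 1, h))


def subagging_estimate(x, statistic, b, h):
    """Average `statistic` over the deterministic blocks of `x`.

    Returns the aggregated value together with the per-block values,
    which could also serve as raw material for a subsampling
    distribution estimate.
    """
    x = np.asarray(x)
    starts = block_starts(len(x), b, h)
    values = np.array([statistic(x[s:s + b]) for s in starts])
    return values.mean(), values


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(loc=1.0, scale=2.0, size=100_000)
    # Non-overlapping blocks (h == b); each block is a cheap array slice,
    # so no random index generation is needed.
    estimate, per_block = subagging_estimate(data, np.median, b=1_000, h=1_000)
    print(len(per_block), estimate)
```

No random draws are involved in selecting the blocks, and each block is a contiguous slice, which is what makes this kind of scheme cheap for very large n. Whether the aggregated value matches or improves on the rate of the full-sample estimator depends on how b and h are tuned, which this sketch does not address.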

Publisher

Oxford University Press (OUP)

Subject

Applied Mathematics; Statistics, Probability and Uncertainty; General Agricultural and Biological Sciences; Agricultural and Biological Sciences (miscellaneous); General Mathematics; Statistics and Probability

Link

https://academic.oup.com/biomet/advance-article-pdf/doi/10.1093/biomet/asad021/50422703/asad021.pdf

References (20 articles)

1. Divide and conquer in nonstandard problems and the super-efficiency phenomenon;Banerjee;Ann. Statist.,2019

2. Empirical processes in survey sampling with (conditional) Poisson designs;Bertail;Scand. J. Statist.,2017

3. Randomized maximum-contrast selection: subagging for large-scale regression;Bradic;Electron. J. Statist.,2016

4. Analyzing bagging;Bühlmann;Ann. Statist.,2002

5. Distributed statistical inference for massive data;Chen;Ann. Statist.,2021

Cited by 2 articles.

1. Debiasing Welch’s method for spectral density estimation;Biometrika;2024-07-02

2. Extrapolated Cross-Validation for Randomized Ensembles;Journal of Computational and Graphical Statistics;2023-11-27