High-dimensional semi-supervised learning: in search of optimal inference of the mean

Published:2021-09-14
ISSN:0006-3444
Container-title:Biometrika
language:en

Author:

Zhang Yuqian¹, Bradic Jelena¹

Affiliation:

1. Department of Mathematics, University of California San Diego, 9500 Gilman Drive, La Jolla, California 92093-0112, U.S.A.

Abstract

A fundamental challenge in semi-supervised learning lies in the observed data’s disproportional size when compared with the size of the data collected with missing outcomes. An implicit understanding is that the dataset with missing outcomes, being significantly larger, ought to improve estimation and inference. However, it is unclear to what extent this is correct. We illustrate one clear benefit: root-n inference of the outcome’s mean is possible while only requiring a consistent estimation of the outcome, possibly at a rate slower than root-n. This is achieved by a novel k-fold cross-fitted, double robust estimator. We discuss both linear and nonlinear outcomes. Such an estimator is particularly suited for models that naturally do not admit root-n consistency, such as high-dimensional, nonparametric, or semiparametric models. We apply our methods to the heterogeneous treatment effects.

Publisher

Oxford University Press (OUP)

Subject

Applied Mathematics; Statistics, Probability and Uncertainty; General Agricultural and Biological Sciences; Agricultural and Biological Sciences (miscellaneous); General Mathematics; Statistics and Probability

Link

https://academic.oup.com/biomet/advance-article-pdf/doi/10.1093/biomet/asab042/42112843/asab042.pdf

Cited by 4 articles.

1. Semi-supervised estimation for the varying coefficient regression model;AIMS Mathematics;2024

2. Optimal Subsampling via Predictive Inference;Journal of the American Statistical Association;2023-11-14

3. Prediction-powered inference;Science;2023-11-10

4. Double robust semi-supervised inference for the mean: selection bias under MAR labeling with decaying overlap;Information and Inference: A Journal of the IMA;2023-04-27