
Boosted Kernel Weighting – Using Statistical Learning to Improve Inference from Nonprobability Samples

Published:2020-12-03 Issue:5 Volume:9 Page:1088-1113
ISSN:2325-0984
Container-title:Journal of Survey Statistics and Methodology
language:en

Author:

Kern Christoph¹, Li Yan², Wang Lingxiao³

Affiliation:

1. PostDoc with the School of Social Sciences, University of Mannheim, A 5, 6, 68159 Mannheim, Baden Wuerttemberg, Germany, and The Joint Program in Survey Methodology, University of Maryland, 1218 LeFrak Hall, 7251 Preinkert Dr., College Park, MD 20742, USA

2. Professor with The Joint Program in Survey Methodology, University of Maryland, 1218 LeFrak Hall, 7251 Preinkert Dr., College Park, MD 20742, USA

3. The Joint Program in Survey Methodology, University of Maryland, 1218 LeFrak Hall, 7251 Preinkert Dr., College Park, MD 20742, USA

Abstract

Given the growing popularity of nonprobability samples as a cost- and time-efficient alternative to probability sampling, a variety of adjustment approaches have been proposed to correct for self-selection bias in nonrandom samples. Popular methods such as inverse propensity-score weighting (IPSW) and propensity-score (PS) adjustment by subclassification (PSAS) utilize a probability sample as a reference to estimate pseudo-weights for the nonprobability sample based on PSs. A recent contribution, kernel weighting (KW), has been shown to be able to improve over IPSW and PSAS with respect to mean squared error. However, the effectiveness of these methods for reducing bias critically depends on the ability of the underlying propensity model to reflect the true (self-)selection process, which is a challenging task with parametric regression. In this study, we propose a set of pseudo-weights construction methods, KW-ML, utilizing both machine learning (ML) methods (to estimate PSs) and KW (to construct pseudo-weights based on the ML-estimated PSs), which provides added flexibility over logistic regression-based methods. We compare the proposed KW-ML pseudo-weights that are based on model-based recursive partitioning, conditional random forests, gradient tree boosting, and model-based boosting, with KW pseudo-weights based on parametric logistic regression in population mean estimation via simulations and a real data example. Our results indicate that particularly boosting methods represent promising alternatives to logistic regression and result in KW estimates with lower bias in a variety of settings, without increasing variance.
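The two-step idea in the abstract (estimate PSs with any ML model, then turn them into pseudo-weights by kernel weighting) can be illustrated with a minimal sketch of the second step. This is not the authors' implementation: the abstract does not specify the kernel, the bandwidth, or the scale of the scores, so the Gaussian kernel, the fixed `bandwidth`, and the function names `kernel_pseudo_weights` and `weighted_mean` are assumptions made for illustration. The scores are assumed to have been estimated beforehand by some classifier (for example, gradient tree boosting).

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_pseudo_weights(ps_nonprob, ps_ref, ref_weights, bandwidth=0.1):
    """Spread each reference unit's survey weight over the nonprobability units.

    ps_nonprob  : estimated propensity scores of the nonprobability units
    ps_ref      : estimated propensity scores of the reference (probability) units
    ref_weights : survey weights of the reference units
    bandwidth   : kernel bandwidth (hypothetical default, not from the paper)
    """
    pseudo = [0.0] * len(ps_nonprob)
    for p_i, d_i in zip(ps_ref, ref_weights):
        # Similarity of this reference unit to every nonprobability unit.
        k = [gaussian_kernel((p_i - p_j) / bandwidth) for p_j in ps_nonprob]
        total = sum(k)
        if total == 0.0:
            # No nonprobability unit is close enough; this sketch drops the weight.
            continue
        # Share the survey weight d_i in proportion to kernel similarity.
        for j, k_ij in enumerate(k):
            pseudo[j] += d_i * k_ij / total
    return pseudo


def weighted_mean(y, w):
    """Pseudo-weighted estimate of a population mean."""
    return sum(yi * wi for yi, wi in zip(y, w)) / sum(w)


# Toy inputs (made up for illustration only).
ps_nonprob = [0.2, 0.5, 0.8]
ps_ref = [0.25, 0.75]
ref_weights = [100.0, 300.0]
y_nonprob = [1.0, 2.0, 3.0]

w = kernel_pseudo_weights(ps_nonprob, ps_ref, ref_weights)
estimate = weighted_mean(y_nonprob, w)
```

Because each reference unit's weight is fully shared out, the pseudo-weights in this sketch sum to the total of the reference survey weights whenever every reference unit has at least one nonzero kernel value. Swapping the PS model (logistic regression versus a boosting method) changes only the scores passed in, not this step.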

Publisher

Oxford University Press (OUP)

Subject

Applied Mathematics; Statistics, Probability and Uncertainty; Social Sciences (miscellaneous); Statistics and Probability

Link

https://academic.oup.com/jssam/article-pdf/9/5/1088/41727206/smaa028.pdf

Reference: 52 articles.

1. Balance Diagnostics for Comparing the Distribution of Baseline Covariates between Treatment Groups in Propensity-Score Matched Samples;Austin;Statistics in Medicine,2009

2. A Random Forest Guided Tour;Biau;TEST,2016

3. Boosting Algorithms: Regularization, Prediction and Model Fitting (with Discussion);Bühlmann;Statistical Science,2007

4. Globally Efficient Nonparametric Inference of Average Treatment Effects by Empirical Balancing Calibration Weighting;Chan;Journal of the Royal Statistical Society: Series B,2016

Cited by 11 articles.

1. A new technique for handling non-probability samples based on model-assisted kernel weighting;Mathematics and Computers in Simulation;2025-01

2. Estimating response propensities in nonprobability surveys using machine learning weighted models;Mathematics and Computers in Simulation;2024-11

3. Nonparticipation Bias in Accelerometer-Based Studies and the Use of Propensity Scores;Social Science Computer Review;2024-05-16

4. Kernel Weighting for blending probability and non-probability survey samples;SORT-Statistics and Operations Research Transactions;2024

5. Book Review: Silvia Biffignandi and Jelke Bethlehem. Handbook of Web Surveys, 2nd edition. 2021 Wiley, ISBN: 978-1-119-37168-7, 624 pps;Journal of Official Statistics;2023-12-01