Affiliation:
1. Division of Biostatistics, Weill Cornell Medicine, New York, NY 10065, USA
2. Department of Neurology, Weill Cornell Medicine, New York, NY 10065, USA
Abstract
Summary
We introduce a novel Bayesian estimator for the class proportion in an unlabeled dataset, based on the targeted learning framework. The procedure requires the specification of a prior (and outputs a posterior) only for the target of inference, and yields a tightly concentrated posterior. When the scientific question can be characterized by a low-dimensional parameter functional, this focus on target prior and posterior distributions perfectly aligns with Bayesian subjectivism. We prove a Bernstein–von Mises-type result for our proposed Bayesian procedure, which guarantees that the posterior distribution converges to the distribution of an efficient, asymptotically linear estimator. In particular, the posterior is Gaussian, doubly robust, and efficient in the limit, under the only assumption that certain nuisance parameters are estimated at slower-than-parametric rates. We perform numerical studies illustrating the frequentist properties of the method. We also illustrate their use in a motivating application to estimate the proportion of embolic strokes of undetermined source arising from occult cardiac sources or large-artery atherosclerotic lesions. Though we focus on the motivating example of the proportion of cases in an unlabeled dataset, the procedure is general and can be adapted to estimate any pathwise differentiable parameter in a non-parametric model.
Publisher
Oxford University Press (OUP)
Subject
Statistics, Probability and Uncertainty,General Medicine,Statistics and Probability
Cited by
1 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献