Affiliation:
1. Stanford University Department of Statistics, 390 Jane Stanford Way, Stanford, California 94305, U.S.A.
Abstract
Summary
Parameters of subpopulations can be more relevant than those of superpopulations. For example, a healthcare provider may be interested in the effect of a treatment plan for a specific subset of their patients; policymakers may be concerned with the impact of a policy in a particular state within a given population. In these cases, the focus is on a specific finite population, as opposed to an infinite superpopulation. Such a population can be characterized by fixing some attributes that are intrinsic to them, leaving unexplained variations like measurement error as random. Inference for a population with fixed attributes can then be modelled as inferring parameters of a conditional distribution. Accordingly, it is desirable that confidence intervals are conditionally valid for the realized population, instead of marginalizing over many possible draws of populations. We provide a statistical inference framework for parameters of finite populations with known attributes. Leveraging the attribute information, our estimators and confidence intervals closely target a specific finite population. When the data are from the population of interest, our confidence intervals attain asymptotic conditional validity, given the attributes, and are shorter than those for superpopulation inference. In addition, we develop procedures to infer parameters of new populations with differing covariate distributions; the confidence intervals are also conditionally valid for the new populations under mild conditions. Our methods extend to situations where the fixed information has a weaker structure or is only partially observed. We demonstrate the validity and applicability of our methods using simulated data and a real-world dataset for predicting car prices.
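The distinction between conditional and superpopulation inference can be illustrated with a toy simulation. The sketch below is not the paper's procedure; it assumes a hypothetical linear model Y = beta * x + noise, holds the attributes x fixed across replications, and redraws only the noise. The conditional target is taken to be the average of E[Y | x_i] over the realized population. The function name `simulate` and all parameter values are illustrative assumptions.

```python
import random
import statistics


def simulate(n=200, beta=2.0, sigma=1.0, reps=2000, seed=0):
    """Toy comparison of conditional and superpopulation intervals.

    Assumed model (hypothetical, for illustration only):
    y_i = beta * x_i + eps_i, with eps_i ~ N(0, sigma^2).
    """
    rng = random.Random(seed)
    z = 1.96  # approximate 97.5% standard normal quantile

    # Fixed attributes: drawn once, then held fixed in every replication.
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x_bar = statistics.fmean(x)
    sxx = sum((xi - x_bar) ** 2 for xi in x)

    # Conditional target: average of E[Y | x_i] over the realized population.
    cond_target = beta * x_bar

    covered = 0
    cond_len = 0.0
    marg_len = 0.0
    for _ in range(reps):
        # Only the unexplained variation is random.
        y = [beta * xi + rng.gauss(0.0, sigma) for xi in x]
        y_bar = statistics.fmean(y)

        # Residual standard deviation from a simple least-squares fit of y on x.
        slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
        resid = [(yi - y_bar) - slope * (xi - x_bar) for xi, yi in zip(x, y)]
        s_resid = (sum(r * r for r in resid) / (n - 2)) ** 0.5

        # Total standard deviation, as used by a superpopulation-style interval.
        s_y = statistics.stdev(y)

        half_cond = z * s_resid / n ** 0.5
        half_marg = z * s_y / n ** 0.5

        covered += abs(y_bar - cond_target) <= half_cond
        cond_len += 2 * half_cond
        marg_len += 2 * half_marg

    return covered / reps, cond_len / reps, marg_len / reps


if __name__ == "__main__":
    coverage, cond_length, marg_length = simulate()
    print(f"conditional coverage of the conditional target: {coverage:.3f}")
    print(f"mean conditional interval length: {cond_length:.3f}")
    print(f"mean superpopulation interval length: {marg_length:.3f}")
```

Under these assumptions, the interval built from the residual spread is shorter because the variation explained by the fixed attributes no longer counts as sampling noise, which mirrors the abstract's claim in a much simpler setting.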
Publisher
Oxford University Press (OUP)
Subject
Applied Mathematics; Statistics, Probability and Uncertainty; General Agricultural and Biological Sciences; Agricultural and Biological Sciences (miscellaneous); General Mathematics; Statistics and Probability
Cited by
1 article.
1. Cross-prediction-powered inference;Proceedings of the National Academy of Sciences;2024-04-03