Affiliation:
1. University of California at Los Angeles
2. Harvard University
Abstract
Methods for handling missing data in social science data sets are reviewed. Limitations of common practical approaches, including complete-case analysis, available-case analysis and imputation, are illustrated on a simple missing-data problem with one complete and one incomplete variable. Two more principled approaches, namely maximum likelihood under a model for the data and missing-data mechanism and multiple imputation, are applied to the bivariate problem. General properties of these methods are outlined, and applications to more complex missing-data problems are discussed. The EM algorithm, a convenient method for computing maximum likelihood estimates in missing-data problems, is described and applied to two common models, the multivariate normal model for continuous data and the multinomial model for discrete data. Multiple imputation under explicit or implicit models is recommended as a method that retains the advantages of imputation and overcomes its limitations.
Subject
Sociology and Political Science,Social Sciences (miscellaneous)
Cited by
754 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献