Comparison of Two Approaches for Handling Missing Covariates in Logistic Regression

Published: 2007-09-19 Issue: 1 Volume: 68 Page: 58-77
ISSN: 0013-1644
Container-title: Educational and Psychological Measurement
Language: en

Author:

Peng Chao-Ying Joanne¹, Jin Zhu²

Affiliation:

1. Indiana University, Bloomington

2. Genentech, Inc.

Abstract

For the past 25 years, methodological advances have been made in missing data treatment. Most published work has focused on missing data in dependent variables under various conditions. The present study seeks to fill the void by comparing two approaches for handling missing data in categorical covariates in logistic regression: the expectation-maximization (EM) method of weights and multiple imputation (MI). Sample data are drawn randomly from a population with known characteristics. Missing data on covariates are simulated under two conditions: missing completely at random and missing at random with different missing rates. A logistic regression model was fit to each sample using either the EM or MI approach. The performance of these two approaches is compared on four criteria: bias, efficiency, coverage, and rejection rate. Results generally favored MI over EM. Practical issues such as implementation, inclusion of continuous covariates, and interactions between covariates are discussed.
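The simulation design described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the 1.96 critical value, and the use of empirical variance as a stand-in for efficiency are all assumptions made here for illustration, and the EM and MI estimation steps themselves are omitted. The sketch shows only how the two missingness conditions differ and how the four criteria might be computed from replicate estimates of one coefficient.

```python
import random
from statistics import mean, variance


def make_missing_mcar(x, rate, rng):
    """MCAR: every value has the same deletion probability, unrelated to any data."""
    return [None if rng.random() < rate else v for v in x]


def make_missing_mar(x, z, rate_if_z0, rate_if_z1, rng):
    """MAR: the deletion probability depends only on a fully observed variable z."""
    out = []
    for v, zi in zip(x, z):
        rate = rate_if_z1 if zi == 1 else rate_if_z0
        out.append(None if rng.random() < rate else v)
    return out


def evaluate(estimates, std_errors, true_beta, z_crit=1.96):
    """Summarize replicate estimates of one coefficient (hypothetical definitions).

    bias           : mean estimate minus the true value
    efficiency     : empirical variance of the estimates (assumed proxy)
    coverage       : share of Wald intervals that contain the true value
    rejection_rate : share of replicates rejecting H0: beta = 0
    """
    pairs = list(zip(estimates, std_errors))
    return {
        "bias": mean(estimates) - true_beta,
        "efficiency": variance(estimates),
        "coverage": mean(1.0 if abs(b - true_beta) <= z_crit * se else 0.0
                         for b, se in pairs),
        "rejection_rate": mean(1.0 if abs(b / se) > z_crit else 0.0
                               for b, se in pairs),
    }


# Usage: a binary covariate x and a fully observed binary variable z.
rng = random.Random(0)
z = [rng.randint(0, 1) for _ in range(200)]
x = [rng.randint(0, 1) for _ in range(200)]
x_mcar = make_missing_mcar(x, 0.2, rng)          # 20% missing everywhere
x_mar = make_missing_mar(x, z, 0.1, 0.3, rng)    # more missing when z == 1
```

In a full study, each incomplete sample would be passed to an EM-by-weights or MI routine, and `evaluate` would then be applied to the coefficient estimates and standard errors collected across replications.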

Publisher

SAGE Publications

Subject

Applied Mathematics,Applied Psychology,Developmental and Educational Psychology,Education

Link

http://journals.sagepub.com/doi/pdf/10.1177/0013164407305582

References: 24 articles.

1. ML– and semiparametric estimation in logistic models with incomplete covariate data

2. Hidden Bias in the Use of Archival Data

Cited by 21 articles.

1. Bayesian graph convolutional network with partial observations;PLOS ONE;2024-07-18

2. Changes in Online Sexual Activities During the Lockdown Caused by COVID-19 in Spain: “INSIDE” Project;Sexuality Research and Social Policy;2024-06-18

3. An evaluation of methods to handle missing data in the context of latent variable interaction analysis: multiple imputation, maximum likelihood, and random forest algorithm;Japanese Journal of Statistics and Data Science;2022-08-11

4. Efficient Utilization of Missing Data in Cost-Sensitive Learning;IEEE Transactions on Knowledge and Data Engineering;2021-06-01

5. The Relationship Between Physical Activity and School Success Among Children With and Without Special Health Care Needs;Journal of School Health;2021-03-25