A Comparison of Imputation Methods for Missing Risk Factor Data from Large Real-world Electronic Medical Records for Comparative Effectiveness Studies (Preprint)
Abstract
Background: Evaluation of appropriate methodologies for imputing missing risk factor or outcome data from electronic medical records (EMRs) is crucial for comparative effectiveness studies, yet it is lacking. Robust imputation of missing data relies on understanding the predictors of missingness in the risk factor data, especially in patients with chronic diseases. These two aspects have not been explored simultaneously to support methodological developments in clinical epidemiological studies with real-world data.
Methods: Using disease-biomarker data (glycated haemoglobin, HbA1c) from a large EMR database of patients with diabetes, exploratory analyses were conducted to identify possible predictors of missingness. Three approaches based on the multiple imputation (MI) technique, namely two-fold MI, MI by chained equations, and MI with Markov chain Monte Carlo, were evaluated in terms of their robustness in imputing missing data. Inferences on the comparative effectiveness of two anti-diabetes therapies drawn from the imputed data were compared with those from complete-case analyses.
Results: Older patients and patients with higher disease severity were less likely to have missing HbA1c data longitudinally over 12 months, while gender and pre-existing comorbidities were not associated with the likelihood of missingness. No significant differences were observed in the distributions of the follow-up data imputed by the three methods.
Conclusion: While complete-case analyses were prone to bias by indication, the three MI techniques appeared to be valid for a large proportion of missing primary outcome data under unknown patterns of missingness, and provided consistent and reliable clinical inferences.
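All three MI approaches named above produce several completed datasets, and the analysis results from each must be combined into a single estimate. The abstract does not say how this pooling was done; the sketch below assumes the standard Rubin's rules and uses made-up numbers purely for illustration. The function name `pool_rubin` and the example values are hypothetical and are not taken from the study.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m per-imputation estimates and their squared standard errors
    using Rubin's rules (assumed here; not stated in the abstract)."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    pooled = mean(estimates)        # pooled point estimate
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, sqrt(total)


# Hypothetical treatment-effect estimates (e.g. difference in HbA1c change)
# from three imputed datasets, each with squared standard error 0.04.
estimates = [-0.50, -0.40, -0.60]
variances = [0.04, 0.04, 0.04]
pooled, se = pool_rubin(estimates, variances)
print(f"pooled estimate = {pooled:.3f}, pooled SE = {se:.3f}")
```

The pooled standard error is larger than any single-dataset standard error because the between-imputation term carries the extra uncertainty due to the missing values.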
Publisher
JMIR Publications Inc.