Author:
Shishegar Rosita, Cox Timothy, Rolls David, Bourgeat Pierrick, Doré Vincent, Lamb Fiona, Robertson Joanne, Laws Simon M., Porter Tenielle, Fripp Jurgen, Tosun Duygu, Maruff Paul, Savage Greg, Rowe Christopher C., Masters Colin L., Weiner Michael W., Villemagne Victor L., Burnham Samantha C.
Abstract
To improve understanding of Alzheimer’s disease, large observational studies are needed to increase power for more nuanced analyses. Combining data across existing observational studies represents one solution. However, the disparity of such datasets makes this a non-trivial task. Here, a machine learning approach was applied to impute longitudinal neuropsychological test scores across two observational studies, namely the Australian Imaging, Biomarkers and Lifestyle Study (AIBL) and the Alzheimer's Disease Neuroimaging Initiative (ADNI), providing an overall harmonised dataset. MissForest, a machine learning algorithm, capitalises on the underlying structure and relationships of the data to impute test scores that were not measured in one study, aligning that study with the other. Results demonstrated that simulated missing values from one dataset could be accurately imputed, and that imputation of actual missing data in one dataset showed comparable discrimination (p < 0.001) for clinical classification to measured data in the other dataset. Further, the increased power of the overall harmonised dataset was demonstrated by observing a significant association between CVLT-II test scores (imputed for ADNI) and PET Amyloid-β in MCI APOE-ε4 homozygotes in the imputed data (N = 65) but not in the original AIBL dataset (N = 11). These results suggest that MissForest can provide a practical solution for data harmonisation using imputation across studies to improve power for more nuanced analyses.
Funder
National Institute on Aging
Commonwealth Scientific and Industrial Research Organisation
Publisher
Springer Science and Business Media LLC
Cited by
21 articles.