Author:
Zhang Qi, Coury Ron, Tang Wenlong
Abstract
Background
Because patients with Mild Cognitive Impairment (MCI) are heterogeneous, it is critical to predict early their risk of converting to Alzheimer’s disease (AD) using routinely collected real-world data such as electronic health records or administrative claims.
Methods
The study used MarketScan Multi-State Medicaid data to construct a cohort of MCI patients. Logistic regression with tree-guided lasso regularization (TGL) was proposed to select important features and predict the risk of converting to AD. A subsampling-based technique was used to extract robust groups of predictive features. Predictive models including logistic regression, generalized random forest, and artificial neural network were trained using the extracted features.
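The subsampling step can be illustrated with a minimal sketch. It is not the authors' implementation: the selector below is a simple mean-difference ranking standing in for the tree-guided lasso fit, the data are synthetic, and all function names, the 50% subsample fraction, and the 0.8 frequency threshold are assumptions chosen for illustration. The idea shown is only the general one: refit a selector on many random subsamples and keep the features that are selected in most of them.

```python
import random


def selection_frequency(X, y, select, n_rounds=100, frac=0.5, seed=0):
    """Return, for each feature, the fraction of subsamples in which it was selected."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    size = max(2, int(frac * n))
    for _ in range(n_rounds):
        idx = rng.sample(range(n), size)  # subsample rows without replacement
        Xs = [X[i] for i in idx]
        ys = [y[i] for i in idx]
        for j in select(Xs, ys):
            counts[j] += 1
    return [c / n_rounds for c in counts]


def top_k_by_mean_difference(k):
    """Hypothetical stand-in selector (NOT tree-guided lasso):
    pick the k features with the largest |mean(x | y=1) - mean(x | y=0)|."""
    def select(X, y):
        p = len(X[0])
        scores = []
        for j in range(p):
            pos = [row[j] for row, label in zip(X, y) if label == 1]
            neg = [row[j] for row, label in zip(X, y) if label == 0]
            if not pos or not neg:
                scores.append(0.0)  # cannot score a feature without both classes
                continue
            scores.append(abs(sum(pos) / len(pos) - sum(neg) / len(neg)))
        return sorted(range(p), key=lambda j: scores[j], reverse=True)[:k]
    return select


def make_toy_data(n=200, p=6, seed=1):
    """Synthetic data (assumption): only features 0 and 1 depend on the label."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [rng.gauss(0, 1) for _ in range(p)]
        row[0] += 2.0 * label
        row[1] += 2.0 * label
        X.append(row)
        y.append(label)
    return X, y


X, y = make_toy_data()
freq = selection_frequency(X, y, top_k_by_mean_difference(2))
stable = [j for j, f in enumerate(freq) if f >= 0.8]  # threshold is an assumption
print(freq, stable)
```

In a real analysis the stand-in selector would be replaced by the penalized model fit, and the retained features would then be passed to the downstream predictive models.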
Results
The proposed TGL workflow selected feature groups that were robust, highly interpretable, and consistent with existing literature. Predictive models using TGL-selected features demonstrated higher prediction accuracy than models using all features or features selected by other methods.
Conclusions
The identified feature groups provide insights into the progression from MCI to AD and can potentially improve risk prediction in clinical practice and trial recruitment.
Publisher
Springer Science and Business Media LLC