Abstract
Introduction
Clinical notes, biomarkers, and neuroimaging have proven valuable in dementia prediction models. Whether commonly available structured clinical data can also predict dementia is an emerging area of research. We aimed to predict Alzheimer’s disease (AD) and Alzheimer’s disease-related dementias (ADRD) in a well-phenotyped, population-based cohort using a machine learning approach.
Methods
Administrative healthcare data (k = 163 diagnostic features), in addition to Census/vital record sociodemographic data (k = 6 features), were linked to the Cache County Study (CCS, 1995–2008).
Results
Among successfully linked UPDB-CCS participants (n = 4206), 522 (12.4%) had incident AD/ADRD according to the CCS “gold standard” assessments. Random Forest models with a 1-year prediction window achieved the best performance, with an area under the curve (AUC) of 0.67. Discrimination declined for dementia subtypes: AD/ADRD (AUC = 0.65); ADRD (AUC = 0.49).
Discussion
Commonly available structured clinical data (without labs, notes, or prescription information) demonstrate a modest ability to predict AD/ADRD, a finding consistent with prior research.
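
As an aid to reading the AUC values reported above, the following is a minimal sketch of how AUC can be computed from its pairwise definition: the probability that a randomly chosen case receives a higher predicted score than a randomly chosen non-case, with ties counted as one half. Under this reading, a value near 0.5 corresponds to chance-level ranking. The function name `auc` and the toy labels and scores are hypothetical and are not taken from the study; only the counts 522 and 4206 come from the Results.

```python
from itertools import product


def auc(labels, scores):
    """Pairwise AUC: share of (case, non-case) pairs in which the case
    has the higher score; tied pairs count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT study data): 1 = incident dementia, 0 = none.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.3, 0.2, 0.1]
toy_auc = auc(labels, scores)

# Proportion of incident AD/ADRD among linked participants, from the Results.
prevalence = 522 / 4206

print(f"toy AUC = {toy_auc:.3f}")
print(f"incident AD/ADRD = {prevalence:.1%}")
```

The pairwise form is quadratic in sample size and is meant only for illustration; a rank-based computation would be preferable on a cohort of several thousand participants.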