Author:
Campos Sergio,Zamora Juan,Allende Héctor,
Abstract
AbstractAlzheimer’s disease is the most common form of dementia and the early detection is essential to prevent its proliferation. Real data available has been of paramount importance in order to achieve progress in the automatic detection despite presenting two major challenges: Multi-source observations containing Magnetic resonance (MRI), Positron emission tomography (PET) and Cerebrospinal fluid data (CSF); and also missing values within all these sources. Most machine learning techniques perform this predictive task by using a single data modality. Nevertheless, the integration of all these sources of evidence could possibly bring a higher performance at different stages of disease progression. The Expectation Maximization (EM) algorithm has been successfully employed to handle missing values, but it is not designed for typical Machine Learning scenarios where an imputation model is created over training data and subsequently applied on a testing set. In this work, we propose EMreg-KNN, a novel supervised and multi-source imputation algorithm. Based on the EM algorithm, EMreg-KNN builds a regression ensemble model for the imputation of future data thus allowing the further utilization of any vector-based Machine Learning method to automatically assess the Alzheimer’s disease diagnosis. Using the ADNI database, the proposed method achieves significant improvements on F1, AUC and Accuracy measures over classical imputation methods for this database using four classification algorithms. Considering these classifiers in four different classification scenarios, our algorithm is experimentally superior in terms of the F measure, in nearly 82% of the cases under evaluation.
Funder
Agencia Nacional de Investigación y Desarrollo
DGIIP-UTFSM
Publisher
Springer Science and Business Media LLC