Author:
Chen Yanjing, Zhao Wei, Yi Sijie, Liu Jun
Abstract
Objective: Machine learning (ML) has been widely used to detect and evaluate major depressive disorder (MDD) using neuroimaging data such as resting-state functional magnetic resonance imaging (rs-fMRI). However, its diagnostic efficiency remains unclear. The aim of this study is to conduct an updated meta-analysis evaluating the diagnostic performance of ML based on rs-fMRI data for MDD.
Methods: English-language databases were searched for relevant studies. The Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool was used to assess the methodological quality of the included studies. A random-effects meta-analytic model was implemented to investigate diagnostic efficiency, including sensitivity, specificity, diagnostic odds ratio (DOR), and area under the curve (AUC). Meta-regression and subgroup analyses were performed to investigate the causes of heterogeneity.
Results: Thirty-one studies were included in this meta-analysis. The pooled sensitivity, specificity, DOR, and AUC with 95% confidence intervals were 0.80 (0.75, 0.83), 0.83 (0.74, 0.82), 14.00 (9.00, 22.00), and 0.86 (0.83, 0.89), respectively. Substantial heterogeneity was observed among the included studies. The meta-regression showed that leave-one-out cross-validation (LOOCV) (sensitivity: p < 0.01, specificity: p < 0.001), graph theory (sensitivity: p < 0.05, specificity: p < 0.01), n > 100 (sensitivity: p < 0.001, specificity: p < 0.001), Siemens equipment (sensitivity: p < 0.01, specificity: p < 0.001), 3.0T field strength (sensitivity: p < 0.001, specificity: p = 0.04), and the Beck Depression Inventory (BDI) (sensitivity: p = 0.04, specificity: p = 0.06) might be sources of heterogeneity. Furthermore, the subgroup analysis showed that the sample size (n > 100: sensitivity: 0.71, specificity: 0.72; n < 100: sensitivity: 0.81, specificity: 0.79), the level of disease severity evaluated by the Hamilton Depression Rating Scale (HDRS/HAMD) (mild vs. moderate vs. severe: sensitivity: 0.52 vs. 0.86 vs. 0.89, specificity: 0.62 vs. 0.78 vs. 0.82, respectively), the depression scale used in patients with comparable levels of severity (BDI vs. HDRS/HAMD: sensitivity: 0.86 vs. 0.87, specificity: 0.78 vs. 0.80, respectively), and the features selected (graph theory vs. functional connectivity: sensitivity: 0.84 vs. 0.86, specificity: 0.76 vs. 0.78, respectively) might be causes of heterogeneity.
Conclusion: ML showed high accuracy for the automatic diagnosis of MDD. Future studies are warranted to promote the potential use of these classification algorithms in clinical settings.
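To make the pooled quantities reported above concrete, the following is a minimal sketch of how sensitivity, specificity, and the DOR can be derived from a study's 2x2 table, and how log DORs might be combined under a random-effects model. It assumes a univariate DerSimonian-Laird estimator, which is a simplification and not necessarily the model used in this meta-analysis; diagnostic accuracy reviews often fit a bivariate model instead. The function names and the three 2x2 tables are hypothetical and are not data from the included studies.

```python
import math


def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, DOR, and var(log DOR) for one 2x2 table."""
    # Continuity correction (assumption): add 0.5 to every cell if any cell is zero.
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (x + 0.5 for x in (tp, fp, fn, tn))
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    dor = (tp * tn) / (fp * fn)
    var_log_dor = 1 / tp + 1 / fp + 1 / fn + 1 / tn
    return sensitivity, specificity, dor, var_log_dor


def pool_random_effects(effects, variances):
    """DerSimonian-Laird pooling; returns (pooled effect, SE, tau^2, I^2)."""
    weights = [1 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q measures between-study dispersion around the fixed-effect mean.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add the between-study variance to each study variance.
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, se, tau2, i2


if __name__ == "__main__":
    # Hypothetical (TP, FP, FN, TN) counts for three illustrative studies.
    tables = [(40, 10, 10, 40), (25, 8, 5, 22), (60, 20, 18, 62)]
    results = [diagnostic_metrics(*t) for t in tables]
    log_dors = [math.log(r[2]) for r in results]
    variances = [r[3] for r in results]
    pooled, se, tau2, i2 = pool_random_effects(log_dors, variances)
    low, high = pooled - 1.96 * se, pooled + 1.96 * se
    print(f"Pooled DOR: {math.exp(pooled):.2f} "
          f"(95% CI {math.exp(low):.2f}, {math.exp(high):.2f}), I^2 = {i2:.2f}")
```

Pooling is done on the log scale because the log DOR is approximately normally distributed; the pooled value and its interval are exponentiated at the end. The I^2 statistic returned here is one common way to quantify the kind of heterogeneity the abstract describes.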