A Highly Accurate Ensemble Classifier for the Molecular Diagnosis of ASD at Ages 1 to 4 Years

Author:

Bao Bokan, Gazestani Vahid H., Xiao Yaqiong, Kim Raphael, Chiang Austin W.T., Nalabolu Srinivasa, Pierce Karen, Robasky Kimberly, Lewis Nathan E., Courchesne Eric

Abstract

Importance: ASD diagnosis remains behavior-based, and the median age of first diagnosis remains unchanged at ∼52 months, nearly 5 years after its first-trimester origin. The long delay between ASD's prenatal onset and eventual diagnosis is likely a missed opportunity. However, accurate and clinically translatable early-age diagnostic methods do not exist because of ASD's genetic and clinical heterogeneity. There is a need for early-age diagnostic biomarkers of ASD that are robust against this heterogeneity.

Objective: To develop a single blood-based molecular classifier that accurately diagnoses ASD at the age of first symptoms.

Design, Setting, and Participants: The study included N=264 ASD, typically developing (TD), and language delayed (LD) toddlers for whom clinical, diagnostic, and leukocyte RNA data were collected. Datasets included Discovery (n=175 ASD, TD subjects), Longitudinal (n=33 ASD, TD subjects), and Replication (n=89 ASD, TD, LD subjects). We developed an ensemble of ASD classifiers by testing 42,840 models, formed by combining 3,570 feature selection sets with 12 classification methods. Models were trained on the Discovery dataset with 5-fold cross-validation. The results were used to construct a Bayesian model averaging-based (BMA) ensemble classifier that was tested in the Discovery and Replication datasets. Data were collected from 2007 to 2012 and analyzed from August 2019 to April 2021.

Main Outcomes and Measures: Primary outcomes were (1) comparisons of the performance of the 42,840 classifier models in correctly identifying ASD vs TD and LD in the Discovery and Replication datasets; and (2) the performance of the ensemble model composed of 1,076 models weighted by Bayesian model averaging.

Results: Of the 42,840 models trained on the Discovery dataset, 1,076 had an average AUC-ROC > 0.8. These 1,076 models used 191 different feature routes and 2,764 gene features. Using weighted BMA of these features and routes, an ensemble classifier was constructed that demonstrated excellent performance in the Discovery and Replication datasets, with ASD classification AUC-ROC scores of 84% to 88%. ASD classification accuracy was comparable against LD and TD subjects and in the Longitudinal dataset. ASD toddlers with ensemble scores above and below the ASD ensemble mean had similar diagnostic and psychometric scores, but those below the ASD ensemble mean had more prenatal risk events than TD toddlers. Ensemble features include genes involved in immune/inflammation, response to cytokines, transcriptional regulation, mitotic cell cycle, and PI3K-AKT, RAS, and Wnt signaling pathways.

Conclusions and Relevance: An ensemble ASD molecular classifier has high and replicable accuracy across the spectrum of ASD clinical characteristics and across toddlers aged 1 to 4 years, and it has potential for clinical translation.

Key Points

Question: Since ASD is genetically and clinically heterogeneous, can a single blood-based molecular classifier accurately diagnose ASD at the age of first symptoms?

Findings: To address heterogeneity, we developed an ASD classifier method that tested 42,840 models. An ensemble of 1,076 models using 191 different feature routes and 2,764 gene features, weighted by Bayesian model averaging, demonstrated excellent performance in the Discovery and Replication datasets, producing ASD classification with area under the receiver operating characteristic curve (AUC-ROC) scores of 84% to 88%. Features include genes involved in immune/inflammation, response to cytokines, transcriptional regulation, mitotic cell cycle, and PI3K-AKT, RAS, and Wnt signaling pathways.

Meaning: An ensemble gene expression ASD classifier has high accuracy across the spectrum of ASD clinical characteristics and across toddlers aged 1 to 4 years.
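The select-then-weight idea described in the abstract can be illustrated with a small sketch. The abstract does not say how the BMA weights were computed, so this sketch assumes a common approximation: a uniform prior over models, with each retained model weighted by its normalized likelihood on held-out labels. The function names, toy labels, and toy probabilities are all hypothetical and are not taken from the study.

```python
import math

# 3,570 feature selection sets x 12 classification methods (figures from the abstract).
N_MODELS = 3570 * 12  # 42,840 candidate models

def auc_roc(labels, scores):
    """Rank-based AUC: chance that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def log_likelihood(labels, probs, eps=1e-9):
    """Bernoulli log-likelihood of the labels under predicted P(ASD)."""
    return sum(math.log(max(eps, p if y == 1 else 1.0 - p))
               for y, p in zip(labels, probs))

def bma_weights(labels, model_probs, auc_threshold=0.8):
    """Keep models with AUC above the threshold, then weight them.

    ASSUMPTION: weights are proportional to held-out likelihood under a
    uniform model prior; the study's actual weighting may differ.
    """
    kept = {m: p for m, p in model_probs.items()
            if auc_roc(labels, p) > auc_threshold}
    if not kept:
        raise ValueError("no model passed the AUC threshold")
    ll = {m: log_likelihood(labels, p) for m, p in kept.items()}
    top = max(ll.values())  # subtract the max for numerical stability
    unnorm = {m: math.exp(v - top) for m, v in ll.items()}
    total = sum(unnorm.values())
    return {m: u / total for m, u in unnorm.items()}

def ensemble_score(weights, subject_probs):
    """Weighted average of the retained models' predictions for one subject."""
    return sum(w * subject_probs[m] for m, w in weights.items())

# Toy held-out data (hypothetical): 1 = ASD, 0 = non-ASD.
labels = [1, 1, 1, 0, 0, 0]
model_probs = {
    "model_a": [0.9, 0.8, 0.7, 0.2, 0.3, 0.1],
    "model_b": [0.6, 0.7, 0.4, 0.5, 0.3, 0.2],
    "model_c": [0.5, 0.4, 0.6, 0.6, 0.5, 0.4],  # near chance, filtered out
}
weights = bma_weights(labels, model_probs)
score = ensemble_score(weights, {"model_a": 0.9, "model_b": 0.6, "model_c": 0.1})
```

In this toy run only the two models that clear the AUC threshold receive weight, and the better-calibrated one dominates the ensemble score. In practice the per-model predictions would come from cross-validation folds rather than a single held-out list.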

Publisher

Cold Spring Harbor Laboratory
