Machine Learning Algorithms Applied to Predict Autism Spectrum Disorder Based on Gut Microbiome Composition

Author:

Olaguez-Gonzalez Juan M. (1), Chairez Isaac (1, 2), Breton-Deval Luz (3, 4), Alfaro-Ponce Mariel (1, 2)

Affiliation:

1. School of Engineering and Science, Tecnologico de Monterrey, Monterrey 64849, Mexico

2. Institute of Advanced Materials for Sustainable Manufacturing, Tecnologico de Monterrey, Monterrey 64849, Mexico

3. Instituto de Biotecnología, Universidad Nacional Autónoma de México, Cuernavaca 62210, Mexico

4. Consejo Nacional de Ciencia y Tecnologia, Mexico City 03940, Mexico

Abstract

Machine learning (ML) techniques are a reliable aid in diagnosing complex diseases. Recent studies have related the composition of the gut microbiota to the presence of autism spectrum disorder (ASD), but the results so far have been mostly contradictory. This work proposes using ML to study gut microbiome composition and its role in the early diagnosis of ASD. We applied support vector machines (SVMs), artificial neural networks (ANNs), and random forest (RF) algorithms to classify subjects as neurotypical (NT) or having ASD, using published data on gut microbiome composition. Naive Bayes, k-nearest neighbors, ensemble learning, logistic regression, linear regression, and decision trees were also trained and validated; however, the three models presented showed the best performance and interpretability. All the ML methods were developed using the SAS Viya software platform. The microbiome’s composition was determined using 16S rRNA sequencing technology. The ML models reached a classification accuracy as high as 90%, with a sensitivity of 96.97% and a specificity of 85.29%. The ANN model made no errors when classifying NT subjects from the first dataset, a notable result compared to traditional tests and data-based approaches. The approach was applied to two datasets, one from the USA and the other from China, with similar findings. The main predictors in the obtained models differ between the analyzed datasets. The most important predictors identified are Bacteroides, Lachnospira, Anaerobutyricum, and Ruminococcus torques. Notably, the predictors in each model include bacteria that are usually considered insignificant in the microbiome’s composition because of their low relative abundance.
This outcome reinforces the conventional understanding of the microbiome’s influence on ASD development, in which an imbalance in the composition of the microbiota can disrupt host–microbiota homeostasis. Because several previous studies focused on the most abundant genera and neglected smaller (and frequently not statistically significant) microbial communities, the impact of such communities has been poorly analyzed. The ML-based models suggest that more research should focus on these less abundant microbes. We propose a novel hypothesis that may explain the contradictory findings on ASD and its relation to gut microbiota composition, and we advocate more in-depth research on variables that may not exhibit statistical significance. While some studies associate ASD with higher Bacillota/Bacteroidota ratios, others find the opposite. These discrepancies are closely linked to the minority organisms in the microbiome’s composition, which may differ between populations but share similar metabolic functions. Therefore, the Bacillota/Bacteroidota ratio may not be a determinant in the manifestation of ASD.
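
The abstract reports accuracy, sensitivity, and specificity for a two-class problem (ASD versus NT). As a reading aid, the following is a minimal sketch of how these three metrics follow from a confusion matrix, assuming ASD is treated as the positive class. It is plain Python, not the authors' SAS Viya workflow, and the labels in the example are hypothetical values chosen for illustration; they are not data from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts, with ASD assumed to be the positive class."""
    tp: int  # ASD subjects classified as ASD
    fn: int  # ASD subjects classified as NT
    tn: int  # NT subjects classified as NT
    fp: int  # NT subjects classified as ASD


def confusion_from_labels(y_true, y_pred, positive="ASD"):
    """Count the four confusion-matrix cells from paired true/predicted labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return ConfusionCounts(tp=tp, fn=fn, tn=tn, fp=fp)


def accuracy(c):
    """Fraction of all subjects classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


def sensitivity(c):
    """Fraction of ASD subjects correctly classified as ASD (true positive rate)."""
    return c.tp / (c.tp + c.fn)


def specificity(c):
    """Fraction of NT subjects correctly classified as NT (true negative rate)."""
    return c.tn / (c.tn + c.fp)


# Hypothetical labels for ten subjects (illustration only, not study data).
y_true = ["ASD"] * 5 + ["NT"] * 5
y_pred = ["ASD", "ASD", "ASD", "ASD", "NT", "NT", "NT", "NT", "ASD", "ASD"]

counts = confusion_from_labels(y_true, y_pred)
print(f"accuracy:    {accuracy(counts):.2%}")
print(f"sensitivity: {sensitivity(counts):.2%}")
print(f"specificity: {specificity(counts):.2%}")
```

Reporting sensitivity and specificity alongside accuracy matters here because a screening model can reach a high overall accuracy while still missing ASD cases or misclassifying NT subjects; the two rates separate those error types.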

Funder

Tecnológico de Monterrey, Institute of Advanced Materials for Sustainable Manufacturing

Publisher

MDPI AG

Subject

General Biochemistry, Genetics and Molecular Biology; Medicine (miscellaneous)
