A Machine Learning-Based Diagnostic Model for Crohn’s Disease and Ulcerative Colitis Utilizing Fecal Microbiome Analysis

Author:

Kim Hyeonwoo (1), Na Ji Eun (2), Kim Sangsoo (1), Kim Tae-Oh (2), Park Soo-Kyung (3, 4), Lee Chil-Woo (4), Kim Kyeong Ok (5), Seo Geom-Seog (6), Kim Min Suk (7), Cha Jae Myung (8), Koo Ja Seol (9), Park Dong-Il (3, 4)

Affiliation:

1. Department of Bioinformatics, Soongsil University, Seoul 06978, Republic of Korea

2. Department of Internal Medicine, College of Medicine, Inje University Haeundae Paik Hospital, Busan 48108, Republic of Korea

3. Division of Gastroenterology, Department of Internal Medicine and Inflammatory Bowel Disease Center, Kangbuk Samsung Hospital, School of Medicine, Sungkyunkwan University, Seoul 03181, Republic of Korea

4. Medical Research Institute, Kangbuk Samsung Hospital, School of Medicine, Sungkyunkwan University, Seoul 03181, Republic of Korea

5. Department of Internal Medicine, College of Medicine, Yeungnam University, Daegu 42415, Republic of Korea

6. Department of Internal Medicine, School of Medicine, Wonkwang University, Iksan 54538, Republic of Korea

7. Department of Human Intelligence and Robot Engineering, Sangmyung University, Cheonan-si 31066, Republic of Korea

8. Department of Internal Medicine, Kyung Hee University Hospital at Gangdong, Kyung Hee University College of Medicine, Seoul 05278, Republic of Korea

9. Division of Gastroenterology and Hepatology, Department of Internal Medicine, Ansan Hospital, Korea University College of Medicine, Ansan 15355, Republic of Korea

Abstract

Recent research has demonstrated the potential of fecal microbiome analysis using machine learning (ML) in the diagnosis of inflammatory bowel disease (IBD), mainly Crohn’s disease (CD) and ulcerative colitis (UC). This study employed the sparse partial least squares discriminant analysis (sPLS-DA) ML technique to develop a robust prediction model for distinguishing among CD, UC, and healthy controls (HCs) based on fecal microbiome data. Using data from multicenter cohorts, we conducted 16S rRNA gene sequencing of fecal samples from patients with CD (n = 671) and UC (n = 114) while forming an HC cohort of 1462 individuals from the Kangbuk Samsung Hospital Healthcare Screening Center. A streamlined pipeline based on HmmUFOTU was used. After a series of filtering steps, 1517 phylotypes and 1846 samples were retained for subsequent analysis. After 100 rounds of downsampling with age, sex, and sample size matching, and division into training and test sets, we used the training set to construct two binary prediction models: one to distinguish IBD from HC and one to distinguish CD from UC. In the test set, both models showed high accuracy and area under the curve (AUC): IBD versus HC (mean accuracy, 0.950; AUC, 0.992) and CD versus UC (mean accuracy, 0.945; AUC, 0.988). This study underscores the diagnostic potential of an sPLS-DA-based ML model applied to fecal microbiome data, both for differentiating IBD from HC and for distinguishing CD from UC.
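The evaluation scheme summarized in the abstract (repeated downsampling to matched group sizes, a training/test split, then mean accuracy and AUC over rounds) can be illustrated with a minimal sketch. This is not the authors' pipeline: the classifier here is a simple centroid-difference scorer standing in for sPLS-DA, matching is on group size only (not age or sex), the data are synthetic, and all function names (`balanced_split`, `auc`, `centroid_scorer`, `evaluate`) and the 30% test fraction are assumptions made for illustration.

```python
import random
from statistics import mean


def balanced_split(labels, test_fraction, rng):
    """Downsample every class to the smallest class size, then split each class
    into train and test indices. Size matching only; age and sex are ignored."""
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    n = min(len(idx) for idx in by_class.values())
    n_test = max(1, int(n * test_fraction))
    train, test = [], []
    for idx in by_class.values():
        picked = rng.sample(idx, n)
        test.extend(picked[:n_test])
        train.extend(picked[n_test:])
    return train, test


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly,
    counting ties as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def centroid_scorer(train_x, train_y):
    """Placeholder model (NOT sPLS-DA): project a sample onto the direction
    joining the two class centroids; positive scores favour class 1."""
    dim = len(train_x[0])
    cents = {}
    for c in (0, 1):
        rows = [x for x, y in zip(train_x, train_y) if y == c]
        cents[c] = [mean(r[j] for r in rows) for j in range(dim)]

    def score(x):
        return sum(
            (x[j] - (cents[0][j] + cents[1][j]) / 2) * (cents[1][j] - cents[0][j])
            for j in range(dim)
        )

    return score


def evaluate(features, labels, rounds=100, test_fraction=0.3, seed=0):
    """Repeat downsample/split/fit/score and return (mean accuracy, mean AUC)."""
    rng = random.Random(seed)
    accs, aucs = [], []
    for _ in range(rounds):
        train, test = balanced_split(labels, test_fraction, rng)
        score = centroid_scorer([features[i] for i in train],
                                [labels[i] for i in train])
        s = [score(features[i]) for i in test]
        y = [labels[i] for i in test]
        accs.append(mean(int((v > 0) == (t == 1)) for v, t in zip(s, y)))
        aucs.append(auc(s, y))
    return mean(accs), mean(aucs)


# Synthetic, imbalanced two-class data (hypothetical; not microbiome data).
data_rng = random.Random(1)
features = ([[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(200)]
            + [[data_rng.gauss(3, 1), data_rng.gauss(3, 1)] for _ in range(60)])
labels = [0] * 200 + [1] * 60

mean_acc, mean_auc = evaluate(features, labels, rounds=20)
print(f"mean accuracy={mean_acc:.3f}, mean AUC={mean_auc:.3f}")
```

Averaging over many downsampling rounds, as the abstract describes, reduces the dependence of the reported metrics on any single random draw from the larger group.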

Funder

National Research Foundation

Korea Health Industry Development Institute 375

Publisher

MDPI AG

Subject

Virology, Microbiology (medical), Microbiology
