Identify gestational diabetes mellitus by deep learning model from cell-free DNA at the early gestation stage

Author:

Wang Yipeng (1), Sun Pei (2), Zhao Zicheng (3, 4), Yan Yousheng (1), Yue Wentao (1), Yang Kai (1), Liu Ruixia (1), Huang Hui (5), Wang Yinan (6), Chen Yin (3), Li Nan (5), Feng Hailong (2), Li Jing (3), Liu Yifan (1), Chen Yujiao (1), Shen Bairong (7, 8), Zhao Lijian (5), Yin Chenghong (1)

Affiliation:

1. Beijing Obstetrics and Gynecology Hospital, Capital Medical University, Beijing Maternal and Child Health Care Hospital, Beijing 100026, P. R. China

2. BGI-Beijing Clinical Laboratories, BGI-Shenzhen, Beijing 101300, P. R. China

3. Shenzhen Byoryn Technology Co., Ltd., Shenzhen 518118, P. R. China

4. Shanxi Keda Research Institute, Taiyuan 030000, P. R. China

5. BGI Genomics, BGI-Shenzhen, Shenzhen 518083, P. R. China

6. Department of Obstetrics and Gynecology, Peking University Shenzhen Hospital, Shenzhen 518055, P. R. China

7. Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan 610041, P. R. China

8. Sichuan University, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan 610041, P. R. China

Abstract

Gestational diabetes mellitus (GDM) is a common complication of pregnancy that has significant adverse effects on both the mother and the fetus. The incidence of GDM is increasing globally, and early diagnosis is critical for timely treatment and for reducing the risk of poor pregnancy outcomes. GDM is usually diagnosed after 24 weeks of gestation, although complications due to GDM can occur much earlier. Copy number variations (CNVs) are a possible biomarker for GDM diagnosis and screening in early gestation. In this study, we proposed a machine-learning method to screen for GDM in early gestation using cell-free DNA (cfDNA) sequencing data from maternal plasma. A total of 5085 patients from northern regions of Mainland China, including 1942 with GDM, were recruited. A non-overlapping sliding window method was applied for CNV coverage screening on low-coverage (~0.2×) sequencing data. The CNV coverage was fed to a convolutional neural network with an attention architecture for binary classification. The model achieved a classification accuracy of 88.14%, precision of 84.07%, recall of 93.04%, F1-score of 88.33% and AUC of 96.49%. The model identified 2190 genes associated with GDM, including DEFA1, DEFA3 and DEFB1. The enriched gene ontology (GO) terms and KEGG pathways showed that many of the identified genes are associated with diabetes-related pathways. Our study demonstrates the feasibility of using cfDNA sequencing data and machine-learning methods for early diagnosis of GDM, which may aid early intervention and the prevention of adverse pregnancy outcomes.
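
The non-overlapping sliding window step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `bin_read_counts` and `normalize_by_mean`, the use of 0-based read start positions, the toy window size, and the mean-based normalization are all assumptions made only for illustration.

```python
def bin_read_counts(read_starts, chrom_length, window_size):
    """Count reads per non-overlapping window along one chromosome.

    read_starts: iterable of 0-based read start positions (assumed input format).
    Returns one count per window; the last window may be shorter than window_size.
    """
    if window_size <= 0 or chrom_length <= 0:
        raise ValueError("window_size and chrom_length must be positive")
    n_windows = -(-chrom_length // window_size)  # ceiling division
    counts = [0] * n_windows
    for pos in read_starts:
        if 0 <= pos < chrom_length:  # ignore positions outside the chromosome
            counts[pos // window_size] += 1
    return counts


def normalize_by_mean(counts):
    """Scale counts so the average window has value 1.0 (an assumed normalization)."""
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    mean = total / len(counts)
    return [c / mean for c in counts]


# Toy example: five reads on a 30 bp "chromosome" with 10 bp windows.
counts = bin_read_counts([0, 5, 10, 19, 25], chrom_length=30, window_size=10)
profile = normalize_by_mean(counts)
```

A per-window profile of this kind is the sort of fixed-length vector that could be passed to a classifier; at roughly 0.2× coverage, real windows would need to be far larger than in this toy example for the counts to be stable.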

Funder

National Key Research and Development Program of China

Publisher

Oxford University Press (OUP)

Subject

Molecular Biology, Information Systems
