Highly accurate disease diagnosis and highly reproducible biomarker identification with PathFormer

Author:

Li Fuhai 1, Dong Zehao 2, Zhao Qihang 3, Payne Philip 4, Province Michael 5, Cruchaga Carlos 6, Zhang Muhan 7, Zhao Tianyu 2, Chen Yixin 8

Affiliation:

1. Department of Pediatrics, Washington University School of Medicine, Washington University in St. Louis

2. Washington University in St. Louis

3. The Hong Kong Polytechnic University

4. Washington University School of Medicine in St. Louis

5. Washington University

6. Washington University School of Medicine

7. Peking University

8. Washington University in St. Louis

Abstract

Biomarker identification, for example by fold change or regression analysis, is critical in omics data analysis for precise disease diagnosis and for understanding disease pathogenesis. Graph neural networks (GNNs) have been the dominant deep learning model for analyzing graph-structured data. However, we found two major limitations of existing GNNs in omics data analysis: limited prediction (diagnosis) accuracy and limited reproducibility of biomarker identification across multiple datasets. Both limitations stem from the unique graph structure of biological signaling pathways, which consist of a large number of targets with dense and complex signaling interactions among them. To address these two challenges, we presented a novel GNN model architecture, named PathFormer, which systematically integrates the signaling network, prior knowledge, and omics data to rank biomarkers and predict disease diagnosis. In the comparison results, PathFormer significantly outperformed existing GNN models in prediction accuracy (~30% accuracy improvement in disease diagnosis compared with existing GNN models) and in the reproducibility of biomarker ranking across different datasets. The improvement was confirmed using two independent Alzheimer's Disease (AD) and cancer transcriptomic datasets. The PathFormer model can be directly applied to other omics data analysis studies.

Publisher

Research Square Platform LLC

Cited by 2 articles.
