Integration of protein context improves protein-based COVID-19 patient stratification

Author:

Gao Jinlong, He Jiale, Zhang Fangfei, Xiao Qi, Cai Xue, Yi Xiao, Zheng Siqi, Zhang Ying, Wang Donglian, Zhu Guangjun, Wang Jing, Shen Bo, Ralser Markus, Guo Tiannan, Zhu Yi

Abstract

Background: Classification of disease severity is crucial for the management of COVID-19. Several studies have shown that individual proteins can be used to classify the severity of COVID-19. Here, we aimed to investigate whether integrating four types of protein context data, namely protein complexes, stoichiometric ratios, pathways, and network degrees, improves the severity classification of COVID-19.

Methods: We performed machine learning on three previously published datasets. The first was a SWATH (sequential window acquisition of all theoretical fragment ion spectra) MS (mass spectrometry)-based proteomic dataset. The second was a TMTpro 16plex-labeled shotgun proteomics dataset. The third was a SWATH dataset of an independent patient cohort.

Results: In addition to twelve proteins, machine learning prioritized two complexes, one stoichiometric ratio, five pathways, and five network degrees, resulting in a 25-feature panel. A model based on these 25 features classified severe cases with an AUC of 0.965, outperforming the models with proteins only. Complement component C9, transthyretin (TTR), the TTR-RBP (transthyretin-retinol binding protein) complex, the stoichiometric ratio of SAA2 (serum amyloid A2)/YLPM1 (YLP Motif Containing 1), and the network degrees of SIRT7 (Sirtuin 7) and A2M (alpha-2-macroglobulin) were highlighted as potential markers by this classifier. The classifier was further validated with a TMT-based proteomic dataset from the same cohort (test dataset 1) and an independent SWATH-based proteomic dataset from Germany (test dataset 2), reaching AUCs of 0.900 and 0.908, respectively. Machine learning models integrating protein context information achieved higher AUCs than models with only one feature type.

Conclusion: Our results show that integrating protein context (protein complexes, stoichiometric ratios, pathways, and network degrees) with individual protein measurements improves phenotype prediction.
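The abstract does not state how the context features or the AUC were computed, so the following is only an illustrative sketch under stated assumptions. It assumes a stoichiometric ratio can be represented as the log2 ratio of two protein abundances, and it computes the AUC as the fraction of (severe, non-severe) sample pairs that the feature ranks correctly. The protein names `A` and `B`, the abundance values, and the labels are all made up for illustration and are not taken from the study.

```python
from math import log2


def log_ratio(abundance_a, abundance_b):
    """Log2 ratio of two protein abundances.

    Assumed definition of a stoichiometric-ratio feature; the abstract
    does not specify the exact formula used by the authors.
    """
    return log2(abundance_a / abundance_b)


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive sample has the higher score; ties count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy samples: abundances of two proteins "A" and "B",
# with severe = 1 for severe cases and 0 otherwise.
samples = [
    {"A": 8.0, "B": 1.0, "severe": 1},
    {"A": 4.0, "B": 1.0, "severe": 1},
    {"A": 2.0, "B": 2.0, "severe": 0},
    {"A": 6.0, "B": 1.0, "severe": 0},
    {"A": 1.0, "B": 2.0, "severe": 0},
]

ratio_feature = [log_ratio(s["A"], s["B"]) for s in samples]
severity = [s["severe"] for s in samples]
toy_auc = auc(ratio_feature, severity)
print(f"AUC of the ratio feature on the toy data: {toy_auc:.3f}")
```

In a real analysis, a feature like this would be one column alongside protein, complex, pathway, and network-degree features in the input to a classifier, and the AUC would be evaluated on held-out data rather than on the samples used for training.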

Funder

National Key R&D Program of China

National Science Fund for Young Scholars

the National Natural Science Foundation of China

Zhejiang Provincial Natural Science Foundation for Distinguished Young Scholars

Publisher

Springer Science and Business Media LLC

Subject

Clinical Biochemistry, Molecular Biology, Molecular Medicine


Cited by 3 articles.