Integration of protein context improves protein-based COVID-19 patient stratification
-
Published:2022-08-11
Issue:1
Volume:19
Page:
-
ISSN:1542-6416
-
Container-title:Clinical Proteomics
-
language:en
-
Short-container-title:Clin Proteom
Author:
Gao Jinlong,He Jiale,Zhang Fangfei,Xiao Qi,Cai Xue,Yi Xiao,Zheng Siqi,Zhang Ying,Wang Donglian,Zhu Guangjun,Wang Jing,Shen Bo,Ralser Markus,Guo Tiannan,Zhu Yi
Abstract
Background
Classification of disease severity is crucial for the management of COVID-19. Several studies have shown that individual proteins can be used to classify the severity of COVID-19. Here, we investigated whether integrating four types of protein context data (protein complexes, stoichiometric ratios, pathways, and network degrees) improves the severity classification of COVID-19.
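To make the idea of "protein context" concrete, the following is a minimal sketch of how context features might be derived from a per-patient protein abundance table. It is not the authors' pipeline: the abstract does not state how each feature type was computed, so the toy abundances, the complex membership (including the gene name `RBP4`), the edge list, and the helper names `complex_abundance`, `stoichiometric_ratio`, `network_degree`, and `context_features` are all assumptions made for illustration.

```python
from statistics import mean

# Hypothetical toy data: protein abundance per patient (arbitrary units).
abundance = {
    "patient_1": {"TTR": 8.0, "RBP4": 4.0, "SAA2": 6.0, "YLPM1": 2.0},
    "patient_2": {"TTR": 2.0, "RBP4": 1.0, "SAA2": 9.0, "YLPM1": 1.5},
}

# Hypothetical complex membership and interaction edges (illustrative only).
complexes = {"TTR-RBP": ["TTR", "RBP4"]}
edges = [("TTR", "RBP4"), ("SAA2", "TTR")]


def complex_abundance(sample, members):
    """Summarize a complex as the mean abundance of its member proteins."""
    return mean(sample[m] for m in members)


def stoichiometric_ratio(sample, numerator, denominator):
    """Ratio of two protein abundances within one sample."""
    return sample[numerator] / sample[denominator]


def network_degree(protein, edge_list):
    """Count the edges touching a protein in an undirected toy network."""
    return sum(protein in edge for edge in edge_list)


def context_features(sample):
    """Combine protein-level values with derived context features."""
    features = dict(sample)  # keep the individual protein abundances
    for name, members in complexes.items():
        features[name] = complex_abundance(sample, members)
    features["SAA2/YLPM1"] = stoichiometric_ratio(sample, "SAA2", "YLPM1")
    return features


for patient, sample in abundance.items():
    print(patient, context_features(sample))
```

The degree helper here works on a fixed graph and therefore yields the same value for every patient; how the study made network degrees informative per sample is not described in the abstract, so that part is only a placeholder for the concept.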
Methods
We applied machine learning to three previously published datasets. The first was a proteomic dataset acquired by SWATH (sequential window acquisition of all theoretical fragment ion spectra) mass spectrometry (MS). The second was a TMTpro 16plex-labeled shotgun proteomics dataset. The third was a SWATH dataset from an independent patient cohort.
Results
Besides twelve proteins, machine learning also prioritized two complexes, one stoichiometric ratio, five pathways, and five network degrees, resulting in a 25-feature panel. A model based on these 25 features classified severe cases with an AUC of 0.965, outperforming the protein-only models. Complement component C9, transthyretin (TTR) and the TTR-RBP (transthyretin-retinol binding protein) complex, the stoichiometric ratio of SAA2 (serum amyloid A2)/YLPM1 (YLP Motif Containing 1), and the network degrees of SIRT7 (Sirtuin 7) and A2M (alpha-2-macroglobulin) were highlighted as potential markers by this classifier. The classifier was further validated with a TMT-based proteomic dataset from the same cohort (test dataset 1) and an independent SWATH-based proteomic dataset from Germany (test dataset 2), reaching AUCs of 0.900 and 0.908, respectively. Machine learning models integrating protein context information achieved higher AUCs than models with only one feature type.
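For readers less familiar with the AUC values reported above, the sketch below computes the area under the ROC curve directly from its rank interpretation: the probability that a randomly chosen severe case receives a higher score than a randomly chosen non-severe case, with ties counted as one half. The function name `roc_auc` and the labels and scores are hypothetical and unrelated to the study's data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 1 (severe) or 0 (non-severe)
    scores: classifier scores, higher meaning more likely severe
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties contribute half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical example: three severe and three non-severe patients.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(round(roc_auc(labels, scores), 3))
```

The pairwise loop is quadratic in the number of samples, which is fine for cohort-sized data; a sort-based implementation would be preferable for large datasets.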
Conclusion
Our results show that integrating protein context (protein complexes, stoichiometric ratios, pathways, and network degrees) with individual protein measurements improves phenotype prediction.
Funder
National Key R&D Program of China; National Science Fund for Young Scholars; the National Natural Science Foundation of China; Zhejiang Provincial Natural Science Foundation for Distinguished Young Scholars
Publisher
Springer Science and Business Media LLC
Subject
Clinical Biochemistry, Molecular Biology, Molecular Medicine
Cited by
3 articles.