Abstract
AbstractUnderstanding the COVID-19 severity and why it differs significantly among patients is a thing of concern to the scientific community. The major contribution of this study arises from the use of a voting ensemble host genetic severity predictor (HGSP) model we developed by combining several state-of-the-art machine learning algorithms (decision tree-based models: Random Forest and XGBoost classifiers). These models were trained using a genetic Whole Exome Sequencing (WES) dataset and clinical covariates (age and gender) formulated from a 5-fold stratified cross-validation computational strategy to randomly split the dataset to overcome model instability. Our study validated the HGSP model based on the 18 features (i.e., 16 identified candidate genetic variants and 2 covariates) identified from a prior study. We provided post-hoc model explanations through the ExplainerDashboard - an open-source python library framework, allowing for deeper insight into the prediction results. We applied the Enrichr and OpenTarget genetics bioinformatic interactive tools to associate the genetic variants for plausible biological insights, and domain interpretations such as pathways, ontologies, and disease/drugs. Through an unsupervised clustering of the SHAP feature importance values, we visualized the complex genetic mechanisms. Our findings show that while age and gender mainly influence COVID-19 severity, a specific group of patients experiences severity due to complex genetic interactions.
Publisher
Cold Spring Harbor Laboratory
Reference40 articles.
1. A Novel Coronavirus from Patients with Pneumonia in China, 2019
2. Is global BCG vaccination-induced trained immunity relevant to the progression of SARS-CoV-2 pandemic?;Allergy: European Journal of Allergy and Clinical Immunology,2020
3. Fallerini, C. et al. Association of toll-like receptor 7 variants with life-threatening COVID-19 disease in males: Findings from a nested case-control study. Elife 10, (2021).
4. Genetic gateways to COVID-19 infection: Implications for risk, severity, and outcomes;The FASEB Journal,2020
5. Machine learning approaches in COVID-19 diagnosis, mortality, and severity risk prediction: A review;Inform Med Unlocked,2021
Cited by
1 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献