Interpretable and Predictive Deep Neural Network Modeling of the SARS-CoV-2 Spike Protein Sequence to Predict COVID-19 Disease Severity

Published:2022-12-08 Issue:12 Volume:11 Page:1786
ISSN:2079-7737
Container-title:Biology
language:en
Short-container-title:Biology

Author:

Sokhansanj Bahrad A., Zhao Zhengqiao, Rosen Gail L.

Abstract

Through the COVID-19 pandemic, SARS-CoV-2 has gained and lost multiple mutations in novel or unexpected combinations. Predicting how complex mutations affect COVID-19 disease severity is critical in planning public health responses as the virus continues to evolve. This paper presents a novel computational framework to complement conventional lineage classification and applies it to predict the severe disease potential of viral genetic variation. The transformer-based neural network model architecture has additional layers that provide sample embeddings and sequence-wide attention for interpretation and visualization. First, training a model to predict SARS-CoV-2 taxonomy validates the architecture’s interpretability. Second, an interpretable predictive model of disease severity is trained on spike protein sequence and patient metadata from GISAID. Confounding effects of changing patient demographics, increasing vaccination rates, and improving treatment over time are addressed by including demographics and case date as independent input to the neural network model. The resulting model can be interpreted to identify potentially significant virus mutations and proves to be a robust predictive tool. Although trained on sequence data obtained entirely before the availability of empirical data for Omicron, the model can predict Omicron’s reduced risk of severe disease, in accord with epidemiological and experimental data.

Funder

National Science Foundation

Publisher

MDPI AG

Subject

General Agricultural and Biological Sciences; General Immunology and Microbiology; General Biochemistry, Genetics and Molecular Biology

Link

https://www.mdpi.com/2079-7737/11/12/1786/pdf

Reference 113 articles.

1. GISAID’s Role in Pandemic Response;Khare;China CDC Wkly.,2021

2. A National Strategy for the “New Normal” of Life With COVID;Emanuel;JAMA,2022

3. Mapping Data to Deep Understanding: Making the Most of the Deluge of SARS-CoV-2 Genome Sequences;Sokhansanj;mSystems,2022

4. Gene of the Month: The 2019-nCoV/SARS-CoV-2 Novel Coronavirus Spike Protein;Pillay;J. Clin. Pathol.,2020

5. Structure, Function, and Antigenicity of the SARS-CoV-2 Spike Glycoprotein;Walls;Cell,2020

Cited by 4 articles.

1. Leveraging Large Language Models for Metagenomic Analysis;2023 IEEE Signal Processing in Medicine and Biology Symposium (SPMB);2023-12-02

2. An Epidemiological Analysis for Assessing and Evaluating COVID-19 Based on Data Analytics in Latin American Countries;Biology;2023-06-20

3. The gray swan: model-based assessment of the risk of sudden failure of hybrid immunity to SARS-CoV-2;2023-03-01

4. CoVEffect: interactive system for mining the effects of SARS-CoV-2 mutations and variants based on deep learning;GigaScience;2022-12-28