Sequence signatures within the genome of SARS-CoV-2 can be used to predict host source

Author:

Rudar Josip (1, 2), Kruczkiewicz Peter (1), Vernygora Oksana (1), Golding G. Brian (3), Hajibabaei Mehrdad (2), Lung Oliver (1, 4)

Affiliation:

1. National Centre for Foreign Animal Disease, Canadian Food Inspection Agency, Winnipeg, Manitoba, Canada

2. Department of Integrative Biology & Centre for Biodiversity Genomics, University of Guelph, Guelph, Ontario, Canada

3. Department of Biology, McMaster University, Hamilton, Ontario, Canada

4. Department of Biological Sciences, University of Manitoba, Winnipeg, Manitoba, Canada

Abstract

ABSTRACT We conducted an in silico analysis to better understand the factors that may influence host adaptation of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in white-tailed deer, humans, and mink, hosts chosen because of the strong evidence of sustained transmission within them. Classification models trained on single-nucleotide and amino acid differences between samples effectively distinguished white-tailed deer-, human-, and mink-derived SARS-CoV-2. For example, the balanced accuracy score of Extremely Randomized Trees classifiers was 0.984 ± 0.006. Eighty-eight commonly identified predictive mutations are found at sites under strong positive and negative selective pressure. A large fraction of sites under selection (86.9%) or identified by machine learning (87.1%) are found in genes other than the spike gene. Some locations encoded by these gene regions are predicted to be B- and T-cell epitopes or are implicated in modulating the immune response, suggesting that host adaptation may involve evasion of the host immune system, modulation of the class I major histocompatibility complex, and diminished recognition of immune epitopes by CD8+ T cells. Our selection and machine learning analyses also showed that silent mutations, such as C7303T and C9430T, play an important role in discriminating deer-derived samples across multiple clades. Finally, our investigation into the origin of the B.1.641 lineage from white-tailed deer in Canada uncovered an additional human sequence from Michigan that is related to the B.1.641 lineage and was sampled near the emergence of this lineage. These findings demonstrate that machine learning approaches can be combined with evolutionary genomics to identify factors possibly involved in the cross-species transmission of viruses and the emergence of novel viral lineages.
IMPORTANCE Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a highly transmissible virus capable of infecting and establishing itself in human and wildlife populations, such as white-tailed deer. This highlights the importance of developing novel ways to identify genetic factors that contribute to its spread and adaptation to new host species, especially since wildlife populations can serve as reservoirs that potentially facilitate the re-introduction of new variants into human populations. In this study, we apply machine learning and phylogenetic methods to uncover biomarkers of SARS-CoV-2 adaptation in mink and white-tailed deer. We find evidence that both non-synonymous and silent mutations can be used to differentiate animal-derived sequences from human-derived ones and from each other. This evidence also suggests that host adaptation involves the evasion of the immune system and the suppression of antigen presentation. Finally, the methods developed here are general and can be used to investigate host adaptation in viruses other than SARS-CoV-2.
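
The abstract reports classifier performance as a balanced accuracy score. As a minimal sketch of what that metric measures, the Python below computes it as the mean of per-class recall, so that a heavily sampled host (for example, human) does not dominate the score. The function name `balanced_accuracy` and the toy labels are assumptions made for illustration; they are not the authors' code or data.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Return the mean of per-class recall, weighting every class equally."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("no samples given")

    total = defaultdict(int)    # samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced toy labels (not from the study):
# 8 human-derived, 2 deer-derived, and 2 mink-derived samples.
y_true = ["human"] * 8 + ["deer"] * 2 + ["mink"] * 2
# One deer-derived sample is misclassified as human.
y_pred = ["human"] * 8 + ["deer", "human"] + ["mink", "mink"]

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = balanced_accuracy(y_true, y_pred)

print(f"plain accuracy:    {plain_accuracy:.3f}")
print(f"balanced accuracy: {score:.3f}")
```

In this toy case a single misclassified deer sample barely changes plain accuracy (11 of 12 correct) but halves the recall for deer, which lowers the balanced score to 2.5/3. This is why the metric suits data sets in which human-derived sequences may far outnumber animal-derived ones.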

Funder

Food from Thought

Canadian Safety and Security Program

Canadian Government | Natural Sciences and Engineering Research Council of Canada

Canadian Government | Environment and Climate Change Canada's Grants and Contributions Program

Publisher

American Society for Microbiology
