SARS-CoV-2 sequence typing, evolution and signatures of selection using CoVa, a Python-based command-line utility

Published: 2020-06-10

Author:

Ali Farhan, Sharda Mohak, Seshasayee Aswin Sai Narain

Abstract

The current global pandemic of COVID-19, caused by SARS-CoV-2, has resulted in millions of infections worldwide within a few months. Global efforts to tackle it have produced a large body of genomic data, which can be used to trace transmission routes, characterize isolates, and monitor variants with potential for unusual virulence. Several groups have analyzed these genomes using different approaches. However, as new data become available, the research community needs a pipeline that performs a set of routine analyses and can quickly incorporate new genome sequences and update the analysis reports. We developed a programmatic tool, CoVa, with this objective. It is a fast, accurate and user-friendly utility for performing a variety of genome analyses on hundreds of SARS-CoV-2 sequences. Using CoVa, we define a modified sequence typing nomenclature and identify sites under positive selection. Further analysis identified some peptides and sites showing geographical patterns of selection. Specifically, we show differences in sequence type distribution between sequences from India and those from the rest of the world. We also find that several sites show signatures of positive selection uniquely in sequences from India. Preliminary evolutionary analysis, using features that will be incorporated into CoVa in the near future, shows a mutation rate of 7.4 × 10⁻⁴ substitutions/site/year, confirms a temporal signal with a November 2019 origin of SARS-CoV-2, and reveals heterogeneity in the geographical distribution of Indian samples.

Publisher

Cold Spring Harbor Laboratory

References: 37 articles.

1. The species Severe acute respiratory syndrome-related coronavirus: classifying 2019-nCoV and naming it SARS-CoV-2; Nat Microbiol.; 2020

2. Laamarti M, Alouane T, Kartti S, Chemao-Elfihri MW, Hakmi M, Essabbar A, et al. Large scale genomic analysis of 3067 SARS-CoV-2 genomes reveals a clonal geodistribution and a rich genetic variations of hotspots mutations. bioRxiv. 2020 May 3; 2020.05.03.074567.

3. MAFFT version 5: improvement in accuracy of multiple sequence alignment

4. FastTree 2 – Approximately Maximum-Likelihood Trees for Large Alignments

5. FUBAR: A Fast, Unconstrained Bayesian AppRoximation for Inferring Selection

Cited by 3 articles.

1. The Impact of Accumulated Mutations in SARS-CoV-2 Variants on the qPCR Detection Efficiency; Frontiers in Cellular and Infection Microbiology; 2022-01-28

2. The adenosine analogue prodrug ATV006 is orally bioavailable and has potent preclinical efficacy against SARS-CoV-2 and its variants; 2021-10-14

3. PAN-INDIA 1000 SARS-CoV-2 RNA Genome Sequencing Reveals Important Insights into the Outbreak; 2020-08-03