Affiliation:
1. Liferiver Science and Technology Institute
Abstract
Viruses leave no fossil record, so no reference or ancestral forms are available for inferring their evolutionary processes, and viruses have no cellular structure, no metabolism, and no means of reproducing outside host cells. Most mutations that confer high pathogenicity are removed from the population, but adaptive mutations can spread epidemically and become fixed. Determining how viruses originated, how they diverged, and how an infectious disease was transmitted is therefore a serious challenge. To predict potential epidemic outbreaks, we tested our strategy, Epi-Clock, which applies the ZHU algorithm to different SARS-CoV-2 datasets collected before outbreaks to search for significant patterns of mutational accumulation that correlate with outbreak events. We hypothesize that specific amino acid substitutions trigger outbreaks. Surprisingly, some inter-species genetic distances within Coronaviridae were shorter than the intra-species distances, which may represent intermediate states between species or subspecies in the evolutionary history of Coronaviridae. Insertions and deletions in whole-genome sequences from different hosts were each associated with new functions or turning points, indicating their important roles in host transmission and host shifts of Coronaviridae. Furthermore, we believe that non-nucleosomal DNA may play a dominant role in the divergence of SARS-CoV-2 lineages in different regions of the world because it lacks nucleosome protection. We suggest that strong selective variation among SARS-CoV-2 lineages is required to produce strong codon usage bias, which appears significantly in B.1.640.2 and B.1.617.2 (Delta). Interestingly, we found that an increasing number of other types of substitutions, such as those resulting from the hitchhiking effect, accumulated, especially in the pre-outbreak phase, even though some earlier substitutions were replaced by other dominant genotypes.
In most validations, we could accurately predict the potential pre-phase of outbreaks, with a median lead time of 5 days. Users of our pipeline can review updated information at https://bioinfo.liferiver.com.cn after a simple registration.
Publisher
Research Square Platform LLC