Affiliation:
1. Department of Biological Sciences, Indian Institute of Science Education & Research, Mohali, India
2. Illawarra Shoalhaven Local Health District (ISLHD), NSW Health, Australia
Abstract
Background:
Viruses have high mutation rates, facilitating rapid evolution and the
emergence of new species, subspecies, strains and recombinant forms. Accurate classification of
these forms is crucial for understanding viral evolution and developing therapeutic applications.
Phylogenetic classification is typically performed by analyzing molecular differences at the genomic
and sub-genomic levels. This involves aligning homologous proteins or genes. However,
there is growing interest in developing alignment-free methods for whole-genome comparisons that
are computationally efficient.
Methods:
Here we elaborate on the Chaos Game Representation (CGR) method, which is based on concepts
from statistical physics and makes no sequence-alignment assumptions. We adopt the CGR method to
classify the closely related clades/lineages A and B of severe acute respiratory syndrome
coronavirus 2 (SARS-CoV-2), which is one of the fastest evolving viruses.
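As an illustration of the general CGR idea, the following Python sketch maps a nucleotide sequence to points in the unit square and bins them into a 2^k x 2^k grid of relative k-mer frequencies. It is a minimal sketch under stated assumptions, not the CGRphylo implementation: the corner assignment (A, C, G, T), the function names `cgr_points` and `frequency_cgr`, and the choice to skip ambiguous bases are all hypothetical choices made for this example.

```python
# Assumed corner convention for the unit square; the paper may use another.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Return the CGR trajectory: each point is the midpoint between the
    previous point and the corner of the current nucleotide."""
    x, y = 0.5, 0.5  # start at the centre of the square
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # simplification: ambiguous bases (e.g. N) are skipped
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def frequency_cgr(seq, k):
    """Bin CGR points into a 2^k x 2^k grid and normalise to relative
    frequencies. Each cell corresponds to one k-mer."""
    n = 2 ** k
    grid = [[0.0] * n for _ in range(n)]
    # The first k-1 points still depend on the starting point, so drop them.
    for x, y in cgr_points(seq)[k - 1:]:
        row = min(int(y * n), n - 1)
        col = min(int(x * n), n - 1)
        grid[row][col] += 1
    total = sum(sum(r) for r in grid)
    if total:
        grid = [[v / total for v in r] for r in grid]
    return grid


if __name__ == "__main__":
    for row in frequency_cgr("ACGTACGGTCA", 2):
        print(row)
```

Because the grid is normalised by the total number of counted points, genomes of different lengths yield frequency grids on the same scale.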
Results:
Our study shows that the CGR approach readily recovers the SARS-CoV-2 phylogeny
from the available whole genomes of lineage A and lineage B sequences. It also accurately
classifies eight different strains and distinguishes the newly evolved XBB variant from its parental strains.
Compared to alignment-based methods (Neighbour-Joining and Maximum Likelihood), the CGR
method requires few computational resources, is fast and accurate for long sequences, and, being a
k-mer based approach, allows simultaneous comparison of a large number of closely related sequences
of different sizes. Further, we developed an R pipeline, CGRphylo, available on GitHub,
which integrates the CGR module with various other R packages to create phylogenetic trees and
visualize them.
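To make the alignment-free comparison concrete, the sketch below computes relative k-mer frequency profiles for several sequences of different lengths and a pairwise distance matrix between them. It is an illustrative example only: the Euclidean distance and the function names `kmer_frequencies` and `distance_matrix` are assumptions made here, and the distance measure used by CGRphylo may differ. A matrix of this kind is the usual input to a tree-building step.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_frequencies(seq, k):
    """Relative frequencies of all 4^k unambiguous k-mers in seq."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # k-mers containing ambiguous bases are ignored (assumed simplification).
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def distance_matrix(seqs, k):
    """Pairwise Euclidean distances between k-mer frequency profiles.
    Euclidean distance is an assumption made for this sketch."""
    profiles = [kmer_frequencies(s, k) for s in seqs]
    n = len(profiles)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = sqrt(sum((a - b) ** 2
                         for a, b in zip(profiles[i], profiles[j])))
            dist[i][j] = dist[j][i] = d
    return dist


if __name__ == "__main__":
    toy = ["ACGTACGT", "ACGTACGTACGT", "AAAAAAAA"]  # made-up toy sequences
    for row in distance_matrix(toy, 2):
        print([round(v, 3) for v in row])
```

No alignment is needed at any point: each sequence is reduced to a fixed-length frequency vector, so the cost of comparing two genomes does not depend on how well they align.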
Conclusion:
Our findings demonstrate the efficacy of the CGR method for accurate classification
and tracking of rapidly evolving viruses, offering valuable insights into the evolution and emergence
of new SARS-CoV-2 strains and recombinants.
Publisher
Bentham Science Publishers Ltd.
Subject
Genetics (clinical), Genetics