Abstract
Among mutations that occur in SARS-CoV-2, efficient identification of mutations advantageous for viral replication and transmission is important to characterize and defeat this rampant virus. Mutations rapidly expanding frequency in a viral population are candidates for advantageous mutations, but neutral mutations hitchhiking with advantageous mutations are also likely to be included. To distinguish these, we focus on mutations that appear to occur independently in different lineages and expand in frequency in a convergent evolutionary manner. Batch-learning SOM (BLSOM) can separate SARS-CoV-2 genome sequences according by lineage from only providing the oligonucleotide composition. Focusing on remarkably expanding 20-mers, each of which is only represented by one copy in the viral genome, allows us to correlate the expanding 20-mers to mutations. Using visualization functions in BLSOM, we can efficiently identify mutations that have expanded remarkably both in the Omicron lineage, which is phylogenetically distinct from other lineages, and in other lineages. Most of these mutations involved changes in amino acids, but there were a few that did not, such as an intergenic mutation.
Funder
Japan Science and Technology Agency
The Japan Society for the Promotion of Science
Publisher
Public Library of Science (PLoS)
Cited by
1 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献