An Alignment-Independent Approach for the Study of Viral Sequence Diversity at Any Given Rank of Taxonomy Lineage

Published: 2021-08-31 Issue: 9 Volume: 10 Page: 853
ISSN: 2079-7737
Container-title: Biology
Language: en
Short-container-title: Biology

Author:

Chong Li Chuin, Lim Wei Lun, Ban Kenneth Hon Kim, Khan Asif M.

Abstract

The study of viral diversity is imperative in understanding sequence change and its implications for intervention strategies. The widely used alignment-dependent approaches to studying viral diversity are limited in their utility as sequence dissimilarity increases, particularly when the analysis is expanded to the genus or higher ranks of the viral species lineage. Herein, we present an alignment-independent algorithm, implemented as a tool, UNIQmin, to determine the effective viral sequence diversity at any rank of the viral taxonomy lineage. This is done by performing an exhaustive search to generate the minimal set of sequences for a given viral non-redundant sequence dataset. The minimal set comprises the smallest possible number of unique sequences required to capture the diversity inherent in the complete set of overlapping k-mers encoded by all the unique sequences in the given dataset. Such dataset compression is possible through the removal of unique sequences whose entire repertoire of overlapping k-mers can be represented by other sequences, which renders them redundant to the collective pool of sequence diversity. Significant reductions of ~44%, ~45%, and ~53% were observed for all reported unique sequences of species Dengue virus, genus Flavivirus, and family Flaviviridae, respectively, while the entire repertoire of nonamer (9-mer) viral peptidome diversity present in the initial input dataset was still captured. The algorithm is scalable for big data, as shown by its application to ~2.2 million non-redundant sequences of all reported viruses. UNIQmin is open source and publicly available on GitHub. The concept of a minimal set is generic and, thus, potentially applicable to other pathogenic microorganisms of non-viral origin, such as bacteria.
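The minimal-set idea in the abstract can be illustrated with a short Python sketch. This is not the UNIQmin implementation: it swaps in a simple greedy set-cover heuristic, which yields a small covering set but is not guaranteed to find the smallest possible one. The function names `kmers` and `greedy_minimal_set` and the toy sequences are hypothetical, chosen only for illustration; the default k = 9 follows the nonamer size mentioned in the abstract.

```python
def kmers(seq, k=9):
    """Return the set of overlapping k-mers in seq (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def greedy_minimal_set(sequences, k=9):
    """Greedy set-cover sketch (an assumption, not UNIQmin's own algorithm).

    Repeatedly keeps the sequence that covers the most still-uncovered
    k-mers until every k-mer in the dataset is represented.
    """
    unique = list(dict.fromkeys(sequences))  # drop exact duplicates, keep order
    kmer_sets = {s: kmers(s, k) for s in unique}
    uncovered = set().union(*kmer_sets.values())
    chosen = []
    while uncovered:
        # Ties are resolved in favour of the earliest sequence in the input.
        best = max(unique, key=lambda s: len(kmer_sets[s] & uncovered))
        chosen.append(best)
        uncovered -= kmer_sets[best]
    return chosen


# Toy input: the second and third sequences contribute no 9-mers
# that the first sequence does not already contain.
toy = ["ABCDEFGHIJKL", "ABCDEFGHIJ", "CDEFGHIJKL", "MNOPQRSTUV"]
minimal = greedy_minimal_set(toy)
print(minimal)
```

In this toy input, two of the four sequences are redundant at the 9-mer level, so the retained set still encodes every 9-mer of the original data while being half the size.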

Funder

Malaysian Medical Association

Publisher

MDPI AG

Subject

General Agricultural and Biological Sciences; General Immunology and Microbiology; General Biochemistry, Genetics and Molecular Biology

Link

https://www.mdpi.com/2079-7737/10/9/853/pdf

Reference: 45 articles.

1. COVID-19: Emergence, Spread, Possible Treatments, and Global Burden

2. Global burden of 369 diseases and injuries in 204 countries and territories, 1990–2019: a systematic analysis for the Global Burden of Disease Study 2019

3. Pathways to human adaptation

4. SnapShot: Evolution of Human Influenza A Viruses

5. Intracellular Pathogens: Host Immunity and Microbial Persistence Strategies

Cited by 3 articles.

1. A Systematic Bioinformatics Approach for Mapping the Minimal Set of a Viral Peptidome;Current Protocols;2024-06

2. Negligible peptidome diversity of SARS-CoV-2 and its higher taxonomic ranks;2022-11-01

3. UNIQmin, an alignment-free tool to study viral sequence diversity across taxonomic lineages: a case study of monkeypox virus;2022-08-09