Generating Minimal Models of H1N1 NS1 Gene Sequences Using Alignment-Based and Alignment-Free Algorithms

Published: 2023-01-10
Volume: 14
Issue: 1
Page: 186
ISSN: 2073-4425
Container-title: Genes
Language: en
Short-container-title: Genes

Author:

Fang Meng, Xu Jiawei, Sun Nan, Yau Stephen S.-T.

Abstract

For virus classification and tracing, one idea is to generate a minimal model from the gene sequences of each virus group, which supports comparative analysis within and between classes as well as classification and tracing of new sequences. The starting point for defining a minimal model of a group of gene sequences is to find their longest common subsequence (LCS), but this is a non-deterministic polynomial-time hard (NP-hard) problem. Therefore, we applied several heuristic approaches to finding the LCS, together with some newer methods of treating gene sequences, including multiple sequence alignment (MSA) and k-mer natural vector (NV) encoding. To evaluate our algorithms, we analyzed a five-fold cross-validation classification scheme on a dataset of H1N1 virus non-structural protein 1 (NS1) gene sequences. The results indicate that the MSA-based algorithm performs best as measured by classification accuracy, while the NV-based algorithm has the advantage in the time complexity of generating minimal models.
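
To make the LCS idea concrete, here is a minimal Python sketch of one possible heuristic. It is an assumption for illustration, not the algorithm used in the paper. It computes an exact pairwise LCS by dynamic programming and then folds that operation across the group. The result is always a common subsequence of every input, but it is not guaranteed to be the longest one, which is consistent with the exact multi-sequence problem being NP-hard. The function names `pairwise_lcs`, `greedy_group_lcs`, and `is_subsequence` and the toy sequences are hypothetical.

```python
from functools import reduce


def pairwise_lcs(a: str, b: str) -> str:
    """Return one longest common subsequence of a and b (O(len(a) * len(b)) DP)."""
    m, n = len(a), len(b)
    # dp[i][j] holds the LCS length of the suffixes a[i:] and b[j:].
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    # Walk forward through the table to recover one optimal subsequence.
    out = []
    i = j = 0
    while i < m and j < n:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return "".join(out)


def greedy_group_lcs(seqs: list[str]) -> str:
    """Heuristic (illustrative assumption): fold pairwise LCS over the group.

    The output is a common subsequence of all inputs, not necessarily the longest.
    """
    if not seqs:
        return ""
    return reduce(pairwise_lcs, seqs)


def is_subsequence(model: str, seq: str) -> bool:
    """Check that model can be obtained from seq by deleting characters."""
    it = iter(seq)
    return all(ch in it for ch in model)


if __name__ == "__main__":
    # Toy sequences, invented for illustration only (not NS1 data).
    group = ["ATGGATCC", "ATGCATCC", "ATGGTTCC"]
    model = greedy_group_lcs(group)
    print(model, all(is_subsequence(model, s) for s in group))
```

Because the fold depends on input order and on how ties are broken in the traceback, different orderings can yield different candidate models. A heuristic of this kind would typically compare several orderings and keep the longest result.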

Funder

National Natural Science Foundation of China

Tsinghua University Education Foundation

Publisher

MDPI AG

Subject

Genetics (clinical), Genetics

Link

https://www.mdpi.com/2073-4425/14/1/186/pdf
