Affiliation:
1. National Institute of Chemistry, Hajdrihova 19, 1000 Ljubljana, Slovenia
2. Department of Chemistry and Biochemistry, University of Minnesota, Duluth, United States
3. Centre for Interdisciplinary Research and Education, Kolkata, India
Abstract
Background:
In this report, we consider a data set of 310 Zika virus genome sequences
taken from three continents: Africa, Asia and South America. The sequences,
which were compiled from GenBank, were derived from the host cells of different primate
and mosquito species (Simiiformes, Aedes opok, Aedes africanus, Aedes luteocephalus, Aedes dalzieli, Aedes
aegypti, and Homo sapiens).
Method:
For chemometrical treatment, the sequences were represented by sequence descriptors
derived from their graphs or neighborhood matrices. The set was analyzed with three
chemometrical methods: Mahalanobis distances, principal component analysis (PCA) and self-organizing
maps (SOM). All three methods gave a good separation of samples with respect to the
region of origin.
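Two of the methods named above, Mahalanobis distance and PCA, can be sketched in a few lines. This is a minimal illustration on synthetic data, not the authors' pipeline: the function names `mahalanobis_to_group` and `pca_scores`, the four-dimensional descriptors, and the two artificial groups are all assumptions made for the example.

```python
import numpy as np

def mahalanobis_to_group(x, group):
    """Mahalanobis distance from vector x to the centroid of `group` (rows = samples)."""
    mean = group.mean(axis=0)
    cov = np.cov(group, rowvar=False)
    cov_inv = np.linalg.pinv(cov)  # pseudo-inverse tolerates a singular covariance
    diff = x - mean
    return float(np.sqrt(diff @ cov_inv @ diff))

def pca_scores(data, n_components=2):
    """Project mean-centred rows of `data` onto the leading principal components."""
    centred = data - data.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[:n_components].T

# Synthetic stand-ins for descriptor vectors from two regions (hypothetical data).
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=1.0, size=(50, 4))
group_b = rng.normal(loc=8.0, scale=1.0, size=(50, 4))

sample = group_b[0]
d_a = mahalanobis_to_group(sample, group_a)  # distance to the "wrong" group
d_b = mahalanobis_to_group(sample, group_b)  # distance to its own group
scores = pca_scores(np.vstack([group_a, group_b]))
```

A sample would be assigned to the group with the smaller distance, and plotting the two columns of `scores` would show whether the groups separate along the first components.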
Results:
This study covers 310 Zika virus genome sequences from different continents. Its aim is to
characterize and compare Zika virus sequences from around the world using alignment-free
sequence comparison and chemometrical methods.
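As an illustration of what an alignment-free descriptor can look like, the sketch below counts adjacent base pairs and returns them as a flattened, normalized 4x4 neighborhood matrix. The abstract does not specify the descriptors actually used, so this particular choice and the name `neighbor_matrix_descriptor` are assumptions for the example only.

```python
from itertools import product

BASES = "ACGT"

def neighbor_matrix_descriptor(seq):
    """Return 16 normalized counts of adjacent base pairs (a flattened 4x4 matrix)."""
    pairs = ["".join(p) for p in product(BASES, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    seq = seq.upper()
    for a, b in zip(seq, seq[1:]):
        if a + b in counts:  # skip pairs containing ambiguous bases such as N
            counts[a + b] += 1
    total = sum(counts.values())
    return [counts[p] / total if total else 0.0 for p in pairs]

# Toy sequence (hypothetical, not taken from the data set).
descriptor = neighbor_matrix_descriptor("ACGTACGTNACG")
```

Because every sequence maps to a vector of the same length, sequences of different lengths can be compared directly without alignment.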
Conclusion:
Mahalanobis distance analysis, self-organizing maps and principal component analysis were
used to carry out the chemometrical analyses of the Zika sequence data. Genome sequences
cluster with respect to the region of origin (continent, country). African samples are well separated
from Asian and South American ones.
Publisher
Bentham Science Publishers Ltd.
Subject
Drug Discovery, Molecular Medicine, General Medicine
Cited by
4 articles.