PhyloMap: an algorithm for visualizing relationships of large sequence data sets and its application to the influenza A virus genome

Published: 2011-06-20 Issue: 1 Volume: 12
ISSN:1471-2105
Container-title:BMC Bioinformatics
language:en
Short-container-title:BMC Bioinformatics

Author:

Zhang Jiajie, Mamlouk Amir Madany, Martinetz Thomas, Chang Suhua, Wang Jing, Hilgenfeld Rolf

Abstract

Background: Results of phylogenetic analysis are often visualized as phylogenetic trees. Such a tree can typically only include up to a few hundred sequences. When more than a few thousand sequences are to be included, analyzing the phylogenetic relationships among them becomes a challenging task. The recent frequent outbreaks of influenza A viruses have resulted in the rapid accumulation of corresponding genome sequences. Currently, there are more than 7500 influenza A virus genomes in the database. There are no efficient ways of representing this huge data set as a whole, thus preventing a further understanding of the diversity of the influenza A virus genome.

Results: Here we present a new algorithm, "PhyloMap", which combines ordination, vector quantization, and phylogenetic tree construction to give an elegant representation of a large sequence data set. The use of PhyloMap on influenza A virus genome sequences reveals the phylogenetic relationships of the internal genes that cannot be seen when only a subset of sequences are analyzed.

Conclusions: The application of PhyloMap to influenza A virus genome data shows that it is a robust algorithm for analyzing large sequence data sets. It utilizes the entire data set, minimizes bias, and provides intuitive visualization. PhyloMap is implemented in JAVA, and the source code is freely available at http://www.biochem.uni-luebeck.de/public/software/phylomap.html
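To make the vector quantization idea in the abstract more concrete, here is a minimal Python sketch. It is not the PhyloMap implementation: this record does not describe the method's details, so the p-distance measure, the greedy farthest-point selection of representatives, the toy sequences, and all function names are assumptions chosen for illustration. The ordination and tree-construction steps are left out entirely.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of aligned positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(seqs):
    """Symmetric matrix of pairwise p-distances."""
    n = len(seqs)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = p_distance(seqs[i], seqs[j])
    return d


def pick_representatives(d, k):
    """Greedy farthest-point selection of k representative indices.

    Assumption: this stands in for the vector quantization step; it is
    not claimed to be the method used by PhyloMap.
    """
    if not 1 <= k <= len(d):
        raise ValueError("k must be between 1 and the number of sequences")
    reps = [0]
    while len(reps) < k:
        # Add the sequence whose nearest representative is farthest away.
        candidates = (i for i in range(len(d)) if i not in reps)
        reps.append(max(candidates, key=lambda i: min(d[i][r] for r in reps)))
    return reps


def assign(d, reps):
    """Map every sequence to the index of its nearest representative."""
    return [min(reps, key=lambda r: d[i][r]) for i in range(len(d))]


# Hypothetical toy alignment: three similar sequences and two outliers.
seqs = ["ACGTACGT", "ACGTACGA", "ACGTTCGA", "TTGTACCC", "TTGTACCA"]
d = distance_matrix(seqs)
reps = pick_representatives(d, 2)
clusters = assign(d, reps)
print(reps)      # [0, 3]
print(clusters)  # [0, 0, 0, 3, 3]
```

In a pipeline of the kind the abstract outlines, a tree would then be built only on the representatives, which keeps the tree readable while every sequence still contributes through its assignment to a representative.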

Publisher

Springer Science and Business Media LLC

Subject

Applied Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Structural Biology

Link

https://link.springer.com/content/pdf/10.1186/1471-2105-12-248.pdf

References: 58 articles.

1. Procter JB, Thompson J, Letunic I, Creevey C, Jossinet F, Barton GJ: Visualization of multiple alignments, phylogenies and gene family evolution. Nat Methods 2010, 7: S16–25. 10.1038/nmeth.1434

2. Pavlopoulos GA, Soldatos TG, Barbosa-Silva A, Schneider R: A reference guide for tree analysis and visualization. BioData Min 2010, 3: 1. 10.1186/1756-0381-3-1

3. Chen JM, Sun YX, Chen JW, Liu S, Yu JM, Shen CJ, Sun XD, Peng D: Panorama phylogenetic diversity and distribution of type A influenza viruses based on their six internal gene sequences. Virol J 2009, 6: 137. 10.1186/1743-422X-6-137

4. Garten RJ, Davis CT, Russell CA, Shu B, Lindstrom S, Balish A, Sessions WM, Xu X, Skepner E, Deyde V, et al.: Antigenic and genetic characteristics of swine-origin 2009 A(H1N1) influenza viruses circulating in humans. Science 2009, 325: 197–201. 10.1126/science.1176225

5. Smith GJ, Vijaykrishna D, Bahl J, Lycett SJ, Worobey M, Pybus OG, Ma SK, Cheung CL, Raghwani J, Bhatt S, et al.: Origins and evolutionary genomics of the 2009 swine-origin H1N1 influenza A epidemic. Nature 2009, 459: 1122–1125. 10.1038/nature08182

Cited by 19 articles.

1. Genetic Diversity in Natural Populations of the Near-Threatened Species Lignobrycon myersi (Characiformes, Triportheidae): Implications for Species Conservation;Zebrafish;2023-12-01

2. Taxonomic Delineation of the Old World Species Stomphastis thraustica (Lepidoptera: Gracillariidae) Feeding on Jatropha gossypiifolia (Euphorbiaceae) that Was Collected in the New World and Imported as a Biocontrol Agent to Australia;Neotropical Entomology;2022-10-17

3. Environmental genomics of Late Pleistocene black bears and giant short-faced bears;Current Biology;2021-06

4. Systematics of Brucepattersonius Hershkovitz, 1998 (Rodentia, Sigmodontinae): molecular species delimitation and morphological analyses suggest an overestimation in species diversity;Systematics and Biodiversity;2021-04-01

5. DISSEQT—DIStribution-based modeling of SEQuence space Time dynamics;Virus Evolution;2019-07-01