Abstract
Homogeneity across lineages is a common assumption in phylogenetics, according to which nucleotide substitution rates are shared by all lineages. Many phylogenetic methods relax this hypothesis but keep the model simple enough to make the process of sequence evolution tractable. On the other hand, dealing successfully with the general case (heterogeneity of rates across lineages) is one of the key features of phylogenetic reconstruction methods based on algebraic tools. The goal of this paper is twofold. First, we present a new weighting system for quartets based on algebraic and semi-algebraic tools, and therefore especially suited to data evolving under heterogeneous rates. This method combines the weights of two previous methods by means of a test based on the positivity of the branch lengths estimated with the paralinear distance. The new weighting system is statistically consistent when applied to data generated under the general Markov model, accounts for rate and base-composition heterogeneity among lineages, and assumes neither stationarity nor time-reversibility. Second, we test and compare the performance of several quartet-based methods for phylogenetic tree reconstruction (namely QFM, wQFM, quartet puzzling, weight optimization and Willson’s method) in combination with several systems of weights, including the new weights and other weights based on algebraic and semi-algebraic methods or on the paralinear distance. These tests are applied to both simulated and real data, and they support weight optimization with the new weights as a reliable and successful reconstruction method that improves upon the accuracy of global methods (such as neighbor-joining or maximum likelihood) in the presence of long branches or on mixtures of distributions on trees.
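To illustrate the two ingredients named in the abstract (the paralinear distance and the sign of the branch lengths estimated from it), here is a minimal Python sketch. It is not the weighting system of the paper: the function names, the input layout (a 4x4 table of joint site-pattern frequencies per pair of taxa, and a 4x4 matrix of pairwise distances for the quartet) and the use of the four-point formula for the interior branch are assumptions made for this example only.

```python
import numpy as np


def paralinear_distance(joint):
    """Paralinear distance from a 4x4 table of joint nucleotide
    frequencies (or counts) for two aligned sequences:
    d = -log( |det J| / sqrt(det D_x * det D_y) ),
    where D_x and D_y are diagonal matrices of the marginals."""
    J = np.asarray(joint, dtype=float)
    J = J / J.sum()                      # counts -> relative frequencies
    row = J.sum(axis=1)                  # base composition of sequence x
    col = J.sum(axis=0)                  # base composition of sequence y
    norm = np.sqrt(np.prod(row) * np.prod(col))
    return -np.log(abs(np.linalg.det(J)) / norm)


def interior_branch_lengths(d):
    """Interior branch length estimated for each of the three quartet
    topologies on taxa 0..3, using the four-point formula on a symmetric
    4x4 distance matrix d (hypothetical helper, illustration only)."""
    d = np.asarray(d, dtype=float)
    total = d[np.triu_indices(4, k=1)].sum()   # sum of the six distances
    splits = {
        "01|23": ((0, 1), (2, 3)),
        "02|13": ((0, 2), (1, 3)),
        "03|12": ((0, 3), (1, 2)),
    }
    lengths = {}
    for name, ((a, b), (c, e)) in splits.items():
        within = d[a, b] + d[c, e]       # distances inside each cherry
        across = total - within          # the four cross distances
        lengths[name] = across / 4.0 - within / 2.0
    return lengths
```

With distances that fit a tree exactly, only the true topology receives a positive interior branch estimate, while the two competing topologies receive negative ones. A positivity check of this kind is the sort of signal the abstract refers to; how the paper turns it into weights is not shown here.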
Funder
Agencia Estatal de Investigación
Agència de Gestió d’Ajuts Universitaris i de Recerca
Publisher
Springer Science and Business Media LLC
Subject
Computational Theory and Mathematics; General Agricultural and Biological Sciences; Pharmacology; General Environmental Science; General Biochemistry, Genetics and Molecular Biology; General Mathematics; Immunology; General Neuroscience
Cited by
2 articles.