MixtureFinder: Estimating DNA mixture models for phylogenetic analyses

Published: 2024-03-21

Author:

Ren Huaiyan, Wong Thomas KF, Minh Bui Quang, Lanfear Robert

Abstract

In phylogenetic studies, both partitioned models and mixture models are used to account for heterogeneity in molecular evolution among the sites of DNA sequence alignments. Partitioned models require the user to specify the grouping of sites into subsets, and then assume that each subset of sites can be modelled by a single common process. Mixture models do not require users to pre-specify subsets of sites, and instead calculate the likelihood of every site under every model, while co-estimating the model weights. While much research has gone into the optimisation of partitioned models by merging user-specified subsets, there has been less attention paid to the optimisation of mixture models for DNA sequence alignments. In this study, we first ask whether a key assumption of partitioned models (that each user-specified subset can be modelled by a single common process) is supported by the data. Having shown that this is not the case, we then design, implement, test, and apply an algorithm, MixtureFinder, to select the optimum number of classes for a mixture model of Q matrices for the standard models of DNA sequence evolution. We show this algorithm performs well on simulated and empirical datasets and suggest that it may be useful for future empirical studies. MixtureFinder is available in IQ-TREE2, and a tutorial for using MixtureFinder can be found here: http://www.iqtree.org/doc/Complex-Models#mixture-models.
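
The mixture-model calculation described in the abstract (the likelihood of every site under every model, combined through model weights) can be illustrated with a minimal Python sketch. This is not MixtureFinder's or IQ-TREE2's implementation. The function names are hypothetical, and the per-site, per-class likelihoods are assumed to have been computed already on a fixed tree. The weight update shown is one generic expectation-maximisation (EM) step, used here only as a stand-in for co-estimating the weights.

```python
import math


def mixture_log_likelihood(site_likelihoods, weights):
    """Log-likelihood of an alignment under a mixture model.

    site_likelihoods[i][k] is the (assumed precomputed) likelihood of
    site i under mixture class k; weights[k] is the weight of class k.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("class weights must sum to 1")
    total = 0.0
    for per_class in site_likelihoods:
        # Site likelihood = weighted sum over all classes.
        site_l = sum(w * l for w, l in zip(weights, per_class))
        total += math.log(site_l)
    return total


def em_update_weights(site_likelihoods, weights):
    """One generic EM step: each new weight is the mean posterior
    probability of its class across all sites."""
    n_sites = len(site_likelihoods)
    new_weights = [0.0] * len(weights)
    for per_class in site_likelihoods:
        joint = [w * l for w, l in zip(weights, per_class)]
        site_l = sum(joint)
        for k, j in enumerate(joint):
            new_weights[k] += j / site_l
    return [x / n_sites for x in new_weights]


# Toy input: two sites, two classes (values are illustrative only).
toy_likelihoods = [[0.2, 0.1], [0.05, 0.3]]
toy_weights = [0.5, 0.5]
updated_weights = em_update_weights(toy_likelihoods, toy_weights)
```

In this sketch no site is assigned to a class in advance; every class contributes to every site in proportion to its weight, which is the contrast with partitioned models that the abstract draws.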

Publisher

Cold Spring Harbor Laboratory
