Abstract
AbstractMaximum likelihood estimation is among the most widely-used methods for inferring phylogenetic trees from sequence data. This paper solves the problem of computing solutions to the maximum likelihood problem for 3-leaf trees under the 2-state symmetric mutation model (CFN model). Our main result is a closed-form solution to the maximum likelihood problem for unrooted 3-leaf trees, given generic data; this result characterizes all of the ways that a maximum likelihood estimate can fail to exist for generic data and provides theoretical validation for predictions made in [28]. Our proof makes use of both classical tools for studying group-based phylogenetic models such as Hadamard conjugation and reparameterization in terms of Fourier coordinates, as well as more recent results concerning the semi-algebraic constraints of the CFN model. To be able to put these into practice, we also give a complete characterization to test genericity.
Publisher
Cold Spring Harbor Laboratory