Affiliation:
1. Naval Postgraduate School, Monterey, CA 93943-5219, USA
Abstract
Comparative phylogenetic analyses of genome data face a significant challenge: some of the given species (or taxa) often have missing genes (i.e., missing data). In such cases, the missing part of a gene tree has to be imputed from a sample of gene trees. In this short paper, we propose a novel method to infer the missing part of a phylogenetic tree using an analogue of classical linear regression in the setting of tropical geometry. In our approach, we consider a tropical polytope, that is, a convex hull with respect to the tropical metric, that is closest to the data points. We give a condition under which a tree estimated by the method is guaranteed to have a Robinson–Foulds (RF) distance of at most four from the ground truth. Computational experiments with simulated data and with an empirical Clavicipitaceae dataset containing more than 4000 genes show that the method works well.
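For readers unfamiliar with the tropical metric and tropical polytopes mentioned in the abstract, the following is a minimal sketch of both notions. It assumes the max-plus convention and the standard closed-form projection onto a tropical polytope generated by finitely many points. The function names are hypothetical, and the sketch illustrates the underlying geometry only, not the imputation procedure proposed in the paper.

```python
from typing import List, Sequence


def tropical_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Tropical metric: max_i (u_i - v_i) - min_i (u_i - v_i).

    It is unchanged when a constant is added to every coordinate of u or v.
    """
    diffs = [a - b for a, b in zip(u, v)]
    return max(diffs) - min(diffs)


def project_onto_tropical_polytope(
    u: Sequence[float], vertices: Sequence[Sequence[float]]
) -> List[float]:
    """Project u onto the max-plus tropical convex hull of `vertices`.

    Assumed formula: lambda_k = min_j (u_j - v_k[j]) for each vertex v_k,
    and the projection has coordinates max_k (lambda_k + v_k[j]).
    """
    lambdas = [min(a - b for a, b in zip(u, v)) for v in vertices]
    return [
        max(lam + v[j] for lam, v in zip(lambdas, vertices))
        for j in range(len(u))
    ]


# Toy example: a tropical line segment spanned by two points.
vertices = [[0, 0, 0], [0, 2, 4]]
point = [0, 1, 1]
projected = project_onto_tropical_polytope(point, vertices)
print(projected)                            # [0, 0, 1]
print(tropical_distance(point, projected))  # 1
```

In this toy example the point (0, 1, 1) lies off the segment, and its projection (0, 0, 1) is at tropical distance 1 from it. A point already in the polytope, such as a vertex, is mapped to itself.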
Subject
General Mathematics, Engineering (miscellaneous), Computer Science (miscellaneous)