Fitting Distances by Tree Metrics Minimizing the Total Error within a Constant Factor


Vincent Cohen-Addad (1), Debarati Das (2), Evangelos Kipouridis (2), Nikos Parotsidis (3), Mikkel Thorup (2)


1. Google Research, Switzerland

2. Department of Computer Science, University of Copenhagen, Copenhagen, Denmark

3. Google Research, Zurich, Switzerland


We consider the numerical taxonomy problem of fitting a positive distance function \(\mathcal{D}:{S\choose 2}\rightarrow \mathbb{R}_{>0}\) by a tree metric. We want a tree \(T\) with positive edge weights, including \(S\) among its vertices, so that the distances in \(T\) match those in \(\mathcal{D}\). A nice application is in evolutionary biology, where the tree \(T\) aims to approximate the branching process leading to the observed distances in \(\mathcal{D}\) [Cavalli-Sforza and Edwards 1967]. We consider the total error, that is, the sum of distance errors over all pairs of points. We present a deterministic polynomial-time algorithm minimizing the total error within a constant factor. We can do this both for general trees and for the special case of ultrametrics, with a root having the same distance to all vertices in \(S\). The problems are APX-hard, so a constant factor is the best we can hope for in polynomial time. The best previous approximation factor was \(O((\log n)(\log \log n))\) by Ailon and Charikar [2005], who wrote “determining whether an \(O(1)\) approximation can be obtained is a fascinating question.”
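To make the objective concrete, the following minimal sketch evaluates the total error of one candidate tree against a given distance function, i.e. the sum over all pairs \(\{u,v\}\subseteq S\) of \(|\mathcal{D}(u,v) - d_T(u,v)|\). It only computes the cost of a fixed tree; it is not the approximation algorithm of the paper. The function names, the adjacency-list representation, and the toy instance are assumptions made for illustration.

```python
from itertools import combinations


def tree_distances(adj, source):
    """Distances from `source` to every vertex of a weighted tree (iterative DFS)."""
    dist = {source: 0.0}
    stack = [source]
    while stack:
        u = stack.pop()
        for v, w in adj[u]:
            if v not in dist:  # in a tree, each vertex is reached by exactly one path
                dist[v] = dist[u] + w
                stack.append(v)
    return dist


def total_error(D, adj, S):
    """Sum over unordered pairs {u, v} of S of |D(u, v) - d_T(u, v)|."""
    from_src = {u: tree_distances(adj, u) for u in S}
    return sum(
        abs(D[frozenset((u, v))] - from_src[u][v])
        for u, v in combinations(S, 2)
    )


# Hypothetical toy instance: three points and a star tree with an extra center "x".
S = ["a", "b", "c"]
D = {
    frozenset(("a", "b")): 2.0,
    frozenset(("a", "c")): 4.0,
    frozenset(("b", "c")): 5.0,
}
adj = {
    "a": [("x", 1.0)],
    "b": [("x", 1.0)],
    "c": [("x", 3.0)],
    "x": [("a", 1.0), ("b", 1.0), ("c", 3.0)],
}

print(total_error(D, adj, S))  # 1.0: only the pair (b, c) is off, by |5 - 4|
```

In this toy instance the tree reproduces two of the three distances exactly, so the total error equals the single mismatch on the pair (b, c). The tree may contain vertices outside \(S\) (here the center "x"), which matches the problem statement's requirement that \(S\) only be included among the vertices.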



Basic Algorithms Research Copenhagen

European Union’s Horizon 2020


Association for Computing Machinery (ACM)

References (54 articles)

1. Amir Abboud, Vincent Cohen-Addad, and Hussein Houdrouge. 2019. Subquadratic high-dimensional hierarchical clustering. In NeurIPS. 11576–11586.

2. On the Approximability of Numerical Taxonomy (Fitting Distances by Tree Metrics)

3. Fitting Tree Metrics: Hierarchical Clustering and Phylogeny

4. Aggregating inconsistent information

5. Noga Alon, Yossi Azar, and Danny Vainstein. 2020. Hierarchical clustering: A 0.585 revenue approximation. In COLT, Vol. 125. 153–162.






