Abstract
Ancestral reconstruction is a classic task in comparative genomics. Here, we study the genome median problem, a related computational problem which, given a set of three or more genomes, asks for a new genome that minimizes the sum of its distances to the given genomes. The distance represents the amount of evolution observed at the genome level, measured as the minimum number of rearrangement operations necessary to transform one genome into the other. For almost all rearrangement operations the median problem is NP-hard; an exception is the breakpoint median, which can be constructed efficiently for multichromosomal circular and mixed genomes. In this work, we study the median problem under a restricted rearrangement measure called c4-distance, which is closely related to the breakpoint distance and the double-cut-and-join (DCJ) distance. We identify tight bounds and decomposers of the c4-median and develop algorithms for its construction: one exact ILP-based algorithm and three combinatorial heuristics. Subsequently, we perform experiments on simulated data sets. Our results suggest that the c4-distance is useful for the study of the genome median problem from both theoretical and practical perspectives.
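The median objective described above can be written compactly as follows. This is only a notational sketch: the symbols G_1, ..., G_k, M, and d are labels introduced here for illustration and do not appear in the original text. The genomes are assumed to share the same gene set, and d stands for whichever rearrangement distance is used (for example the breakpoint, DCJ, or c4-distance).

```latex
% Requires amsmath for \operatorname*.
% Assumed notation: G_1, ..., G_k are the given genomes (k >= 3),
% d is the chosen rearrangement distance, M ranges over candidate genomes.
\[
  M^{*} \in \operatorname*{arg\,min}_{M} \; \sum_{i=1}^{k} d(M, G_i)
\]
```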
Subject
Management Science and Operations Research, Computer Science Applications, Theoretical Computer Science