Abstract
Horizontal gene transfer (HGT) is an important contributor to evolution. Following Walter M. Fitch, two genes are xenologs if at least one HGT separates them. More formally, the directed Fitch graph has a set of genes as its vertices, and directed edges (x, y) for all pairs of genes x and y for which y has been horizontally transferred at least once since it diverged from the last common ancestor of x and y. Subgraphs of Fitch graphs can be inferred by comparative sequence analysis. In many cases, however, only partial knowledge about the “full” Fitch graph can be obtained. Here, we characterize Fitch-satisfiable graphs that can be extended to a biologically feasible “full” Fitch graph and derive a simple polynomial-time recognition algorithm. We then proceed to show that several versions of finding the Fitch graph with total maximum (confidence) edge-weights are NP-hard. In addition, we provide a greedy heuristic for “optimally” recovering Fitch graphs from partial ones. Somewhat surprisingly, even if ∼80% of the information of the underlying input Fitch graph G is lost (i.e., the partial Fitch graph contains only ∼20% of the edges of G), it is possible to recover ∼90% of the original edges of G on average.
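The definition of the directed Fitch graph above can be illustrated with a minimal Python sketch. It assumes a rooted gene tree given as a child-to-parent map and a set of tree edges marked as HGT events; the names `fitch_graph`, `parent`, and `transfer_edges` are hypothetical and chosen for illustration only. It is not the recognition algorithm or the heuristic described in this work, merely a direct reading of the edge definition.

```python
from itertools import permutations


def ancestors(parent, v):
    """Return the nodes on the path from v up to the root, v included."""
    path = [v]
    while v in parent:
        v = parent[v]
        path.append(v)
    return path


def fitch_graph(parent, transfer_edges, leaves):
    """Build the directed Fitch graph of a tree with HGT-labelled edges.

    parent         -- dict mapping each non-root node to its parent (assumed input format)
    transfer_edges -- set of (parent, child) tree edges that represent an HGT event
    leaves         -- iterable of leaf names (the genes)

    (x, y) is an edge iff the path from lca(x, y) down to y contains
    at least one transfer edge.
    """
    edges = set()
    for x, y in permutations(leaves, 2):
        anc_x = set(ancestors(parent, x))
        v = y
        transferred = False
        # Climb from y until the first ancestor shared with x, i.e. lca(x, y).
        while v not in anc_x:
            if (parent[v], v) in transfer_edges:
                transferred = True
            v = parent[v]
        if transferred:
            edges.add((x, y))
    return edges


# Toy tree: root r has children a and u; u has children b and c.
# The edge (u, c) is assumed to be a horizontal transfer.
toy_parent = {"a": "r", "u": "r", "b": "u", "c": "u"}
toy_graph = fitch_graph(toy_parent, {("u", "c")}, ["a", "b", "c"])
```

In this toy tree only gene c has been transferred since diverging from a and from b, so the resulting graph has the edges (a, c) and (b, c) and no edge pointing away from c.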
Publisher
Cold Spring Harbor Laboratory