Author:
Oxusoff Laurent, Préa Pascal, Perez Yvan
Abstract
A new method of genomic map analysis based on formal logic is described. The purpose of the method is to 1) use the mitochondrial genomic organisation of current taxa as datasets, 2) calculate the mutational steps between all mitochondrial gene arrangements, and 3) reconstruct phylogenetic relationships from these calculated mutational steps within a dendrogram, under the assumption of maximum parsimony. Unlike existing methods, which are mainly based on a probabilistic approach, the main strength of this new approach is that it calculates all the exact tree solutions with completeness and provides logical consequences as very robust results. Moreover, the method infers all possible hypothetical ancestors and reconstructs character states for all internal nodes (ancestors) of the trees. We started by testing the method using the deuterostomes as a case study. Then, with sponges as an outgroup, we investigated the mutational network of mitochondrial genomes of 47 bilaterian phyla and emphasised the peculiar case of chaetognaths. This pilot work showed that the use of formal logic in a hypothetico-deductive setting such as phylogeny (where experimental testing of hypotheses is impossible) is very promising for exploring mitochondrial gene rearrangements in deuterostomes and should be applied to many other bilaterian clades.
Author Summary
Investigating how recombination might modify gene arrangements during the evolution of metazoans has become a routine part of mitochondrial genome analysis. In this paper, we present a new approach based on formal logic that provides optimal solutions in the genome rearrangement field. In particular, we improve the sorting by including all rearrangement events, e.g., transposition, inversion and reverse transposition. The problem we face is to find the most parsimonious tree(s) explaining all the rearrangement events from a common ancestor to all the descendants of a given clade (hereinafter the PHYLO problem).
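The rearrangement events named above, and the idea of counting exact mutational steps between two gene arrangements, can be illustrated with a small sketch. It is not the authors' formal-logic method: it swaps in a plain breadth-first search, treats the genome as a linear tuple of signed integers (the sign standing for the strand) rather than a circular molecule, and counts each inversion, transposition and reverse transposition as one step. The names `neighbours` and `rearrangement_distance` are hypothetical.

```python
from collections import deque


def neighbours(order):
    """Yield every gene order that is one rearrangement event away.

    Assumption: `order` is a tuple of distinct signed integers, read as a
    linear gene order; a negative sign marks a gene on the opposite strand.
    """
    n = len(order)
    # Inversion: reverse a segment and flip the strand of each gene in it.
    for i in range(n):
        for j in range(i + 1, n + 1):
            flipped = tuple(-g for g in reversed(order[i:j]))
            yield order[:i] + flipped + order[j:]
    # Transposition: cut a segment and paste it at another position.
    # Reverse transposition: same, but the pasted segment is inverted.
    for i in range(n):
        for j in range(i + 1, n + 1):
            segment = order[i:j]
            rest = order[:i] + order[j:]
            flipped = tuple(-g for g in reversed(segment))
            for k in range(len(rest) + 1):
                if k == i:
                    continue  # pasting back in place is not a move
                yield rest[:k] + segment + rest[k:]
                yield rest[:k] + flipped + rest[k:]


def rearrangement_distance(source, target):
    """Exact minimal number of events from `source` to `target` (BFS)."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        order, steps = queue.popleft()
        for nxt in neighbours(order):
            if nxt == target:
                return steps + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, steps + 1))
    raise ValueError("target cannot be reached from source")


if __name__ == "__main__":
    # Gene 3 moves to the front: a single transposition.
    print(rearrangement_distance((1, 2, 3, 4), (3, 1, 2, 4)))
```

The search is exact because breadth-first search visits arrangements in order of increasing step count, but the number of signed arrangements grows as 2^n · n!, so this sketch is only practical for a handful of genes.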
So far, no complete approach for finding all the correct solutions of PHYLO has been available. Formal logic provides an elegant way to represent and solve such an NP-hard problem. It offers correctness and completeness, and it allows the logical consequences (results true for all solutions found) to be understood. First, one must define PHYLO (axiomatisation) with a set of logic formulas or constraints. Second, a model generator calculates all the models, each model being a solution of PHYLO. Several complete model generators are available, but a recurring difficulty is the computation time as the data set grows. When the search for a solution takes exponential time, two computing strategies are conceivable: an incomplete but fast algorithm that does not guarantee the optimal solution (for example, local improvements from an initial random solution), or a complete (and thus not efficient) algorithm on a smaller, tractable dataset. While the large number of genes found in the nuclear genome strongly limits our ability to use formal logic with any conventional computer, we show in our paper that, for bilaterian mtDNAs, all the correct solutions can be found in a reasonable time thanks to the small number of genes.
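The contrast between a complete search and its logical consequences can be made concrete on a toy case. The sketch below is a deliberately reduced stand-in, not the authors' axiomatisation or model generator: it assumes unsigned linear gene orders, allows only segment reversals, and considers a star tree with one hypothetical ancestor. It enumerates every candidate ancestor, keeps all that minimise the total number of steps to the taxa, and then reports the gene adjacencies shared by every optimal ancestor. All function names are hypothetical.

```python
from collections import deque
from itertools import permutations


def reversal_neighbours(order):
    """Yield every order obtained by reversing one segment of length >= 2."""
    n = len(order)
    for i in range(n - 1):
        for j in range(i + 2, n + 1):
            yield order[:i] + order[i:j][::-1] + order[j:]


def distances_from(source):
    """BFS table: minimal number of reversals from `source` to every order."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        order = queue.popleft()
        for nxt in reversal_neighbours(order):
            if nxt not in dist:
                dist[nxt] = dist[order] + 1
                queue.append(nxt)
    return dist


def all_optimal_ancestors(taxa):
    """Return (best_score, sorted list of ALL ancestors reaching it).

    Assumption: every taxon is a tuple over the same set of genes.
    """
    tables = [distances_from(t) for t in taxa]
    scores = {
        cand: sum(table[cand] for table in tables)
        for cand in permutations(sorted(taxa[0]))
    }
    best = min(scores.values())
    return best, sorted(c for c, s in scores.items() if s == best)


def shared_adjacencies(orders):
    """Gene pairs adjacent in every order: true for all solutions found."""
    sets = [{frozenset(pair) for pair in zip(o, o[1:])} for o in orders]
    return set.intersection(*sets)


if __name__ == "__main__":
    taxa = [(1, 2, 3, 4), (1, 3, 2, 4)]
    best, ancestors = all_optimal_ancestors(taxa)
    print(best, ancestors)
    print(shared_adjacencies(ancestors))
```

With these two taxa, each taxon is itself an optimal ancestor at a total cost of one reversal, and the only adjacency common to both solutions is the pair of genes 2 and 3. Because the enumeration visits all n! candidate orders, it remains feasible only for very small gene sets, which is the trade-off the paragraph above describes.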
Publisher
Cold Spring Harbor Laboratory