Preserving Inversion Phylogeny Reconstruction

Tractability results are rare in the comparison of gene orders for more than two genomes. Here we present a linear-time algorithm for the small parsimony problem (inferring ancestral genomes given a phylogeny on an arbitrary number of genomes) in the case where gene orders are permutations that evolve by inversions not breaking common gene intervals, and where these intervals are organised in a linear structure. We present two examples in which this algorithm allows us to reconstruct the ancestral gene orders in phylogenies of several γ-Proteobacteria species and of Burkholderia strains, respectively. In addition, we prove that the large parsimony problem (where the phylogeny itself is part of the output) remains NP-complete.
