An Exact Algorithm for Sorting by Weighted Preserving Genome Rearrangements

The preserving Genome Sorting Problem (pGSP) asks for a shortest sequence of rearrangement operations that transforms one given gene order into another, using only operations that preserve common intervals, i.e., groups of genes that form an interval in both given gene orders. The wpGSP is the weighted version of the problem, where each type of rearrangement operation has a weight and a sequence of rearrangement operations of minimum total weight is sought. An exact algorithm, called CREx2, is presented that solves the wpGSP for arbitrary gene orders and the following types of rearrangement operations: inversions, transpositions, inverse transpositions, and tandem duplication random loss operations. CREx2 has exponential runtime in the worst case, but linear runtime for problem instances where the common intervals are organized in a linear structure. The efficiency of CREx2 and its usefulness for phylogenetic analysis are shown empirically for gene orders of fungal mitochondrial genomes.
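
To make the notion of a common interval concrete, the following minimal sketch enumerates the common intervals of two gene orders. It is an illustration only and not part of CREx2: the function name `common_intervals` is hypothetical, gene orientations (signs) are ignored, both orders are assumed to contain the same genes exactly once, and the quadratic enumeration shown here is a naive approach rather than one of the efficient algorithms from the literature.

```python
def common_intervals(order_a, order_b):
    """Return index pairs (i, j), i < j, such that the genes order_a[i..j]
    also occupy a contiguous block in order_b (naive O(n^2) enumeration).

    Assumption: both orders are permutations of the same unsigned genes.
    """
    # Position of every gene in the second gene order.
    pos_b = {gene: idx for idx, gene in enumerate(order_b)}
    n = len(order_a)
    intervals = []
    for i in range(n):
        lo = hi = pos_b[order_a[i]]
        for j in range(i + 1, n):
            p = pos_b[order_a[j]]
            lo = min(lo, p)
            hi = max(hi, p)
            # The genes order_a[i..j] form an interval in order_b exactly
            # when their positions there span j - i + 1 consecutive slots.
            if hi - lo == j - i:
                intervals.append((i, j))
    return intervals


if __name__ == "__main__":
    a = [1, 2, 3, 4, 5]
    b = [1, 3, 2, 5, 4]
    for i, j in common_intervals(a, b):
        print(a[i:j + 1])
```

In this example the gene groups {2, 3} and {4, 5} are common intervals, so a preserving rearrangement scenario may invert each group as a whole but may not apply an operation that separates the genes within a group.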
