Iterative Probabilistic Tree Search for the Minimum Common String Partition Problem

The minimum common string partition problem is an NP-hard combinatorial optimization problem with applications in computational biology. In this work, we propose an iterative probabilistic tree search algorithm for tackling this problem. An extensive experimental evaluation shows that our approach outperforms both a standard greedy algorithm and an ant colony optimization metaheuristic from the related literature.
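To make the greedy baseline concrete, the following is a minimal sketch of the greedy heuristic as it is commonly described for this problem: repeatedly extract the longest common substring of the two input strings that does not overlap any previously extracted block. It is not the tree search proposed here, and the function name `greedy_mcsp`, the tie-breaking rule (first maximum found), and the simple quadratic scan per step are illustrative assumptions rather than details taken from this work.

```python
def greedy_mcsp(s1: str, s2: str) -> list[str]:
    """Greedy heuristic sketch: returns the blocks of a common partition.

    Assumes s1 and s2 are related, i.e. contain the same characters
    with the same multiplicities. Ties between equally long candidate
    blocks are broken by taking the first one found (an assumption).
    """
    if sorted(s1) != sorted(s2):
        raise ValueError("input strings must have the same character multiset")

    n = len(s1)
    free1 = [True] * n  # positions of s1 not yet covered by a block
    free2 = [True] * n  # positions of s2 not yet covered by a block
    blocks = []
    remaining = n

    while remaining > 0:
        best_len, best_end1, best_end2 = 0, 0, 0
        # prev[j] = length of the longest common suffix of s1[:i-1] and
        # s2[:j] that uses only free positions (classic LCS-substring DP).
        prev = [0] * (n + 1)
        for i in range(1, n + 1):
            cur = [0] * (n + 1)
            if free1[i - 1]:
                for j in range(1, n + 1):
                    if free2[j - 1] and s1[i - 1] == s2[j - 1]:
                        cur[j] = prev[j - 1] + 1
                        if cur[j] > best_len:
                            best_len, best_end1, best_end2 = cur[j], i, j
            prev = cur

        # Mark the chosen block as covered in both strings.
        start1 = best_end1 - best_len
        start2 = best_end2 - best_len
        for k in range(best_len):
            free1[start1 + k] = False
            free2[start2 + k] = False
        blocks.append(s1[start1:best_end1])
        remaining -= best_len

    return blocks
```

For example, `greedy_mcsp("AGACTG", "ACTAGG")` yields the three blocks `ACT`, `AG`, and `G`. The number of blocks returned is the objective value that heuristics for this problem try to minimize; the greedy choice is fast but is not guaranteed to reach the minimum.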
