Maximal Common Subsequences and Minimal Common Supersequences

The problems of finding a longest common subsequence and a shortest common supersequence of a set of strings are well known. They can be solved in polynomial time for two strings (in fact the problems are dual in this case), or for any fixed number of strings, by dynamic programming. But both problems are NP-hard in general for an arbitrary number k of strings. Here we study the related problems of finding a shortest maximal common subsequence and a longest minimal common supersequence. We describe dynamic programming algorithms for the case of two strings (for which case the problems are no longer dual), which can be extended to any fixed number of strings. We also show that both problems are NP-hard in general for k strings, although the latter problem, unlike shortest common supersequence, is solvable in polynomial time for strings of length 2. Finally, we prove a strong negative approximability result for the shortest maximal common subsequence problem.
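To make the terms in the abstract concrete, the following Python sketch illustrates the classical two-string case. It is not the paper's algorithm for shortest maximal common subsequences. It shows the textbook dynamic program for the length of a longest common subsequence, the duality that gives the length of a shortest common supersequence for two strings, and a brute-force test of whether a given string is a maximal common subsequence. All function names are hypothetical and chosen only for illustration.

```python
def is_subsequence(s, t):
    """Return True if s can be obtained from t by deleting characters."""
    it = iter(t)
    return all(ch in it for ch in s)


def lcs_length(a, b):
    """Length of a longest common subsequence of a and b (textbook DP)."""
    m, n = len(a), len(b)
    # table[i][j] = LCS length of the prefixes a[:i] and b[:j]
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[m][n]


def scs_length(a, b):
    """Length of a shortest common supersequence, via two-string duality."""
    return len(a) + len(b) - lcs_length(a, b)


def is_maximal_common_subsequence(s, a, b):
    """Brute-force check (illustrative assumption, not the paper's method).

    s is maximal if it is a common subsequence of a and b and inserting
    any single character anywhere never yields another common subsequence.
    Checking single insertions suffices: if s were a proper subsequence of
    a longer common subsequence t, some one-character extension of s would
    also be a subsequence of t, hence common.
    """
    if not (is_subsequence(s, a) and is_subsequence(s, b)):
        return False
    alphabet = set(a) & set(b)
    for pos in range(len(s) + 1):
        for ch in alphabet:
            candidate = s[:pos] + ch + s[pos:]
            if is_subsequence(candidate, a) and is_subsequence(candidate, b):
                return False
    return True
```

For the strings "abc" and "bca", a longest common subsequence is "bc", yet "a" is also a maximal common subsequence because it cannot be extended on either side. This is why a shortest maximal common subsequence can be strictly shorter than a longest one, and why the problem is not simply the classical one in disguise.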
