The minimum consistent DFA problem cannot be approximated within any polynomial

The minimum consistent DFA problem is that of finding a DFA with as few states as possible that is consistent with a given sample (a finite collection of words, each labeled as to whether the DFA found should accept or reject). Assuming that P $\neq$ NP, it is shown that for any constant $k$, no polynomial-time algorithm can be guaranteed to find a consistent DFA with fewer than $\mathit{opt}^k$ states, where $\mathit{opt}$ is the number of states in the minimum-state DFA consistent with the sample. This result holds even if the alphabet is of constant size two, and if the algorithm is allowed to produce an NFA, a regular expression, or a regular grammar that is consistent with the sample. A similar nonapproximability result is presented for the problem of finding small consistent linear grammars. For the case of finding minimum consistent DFAs when the alphabet is not of constant size but instead is allowed to vary with the problem specification, the slightly stronger lower bound on approximability of $\mathit{opt}^{(1-\varepsilon)\log\log \mathit{opt}}$ is shown for any $\varepsilon > 0$.
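To make the notion of consistency concrete, the following is a minimal Python sketch of checking whether a DFA agrees with a labeled sample. The representation (a transition dictionary, an integer start state, a set of accepting states) and the names `accepts`, `is_consistent`, `PARITY_DELTA`, and `SAMPLE` are illustrative assumptions, not notation from the paper.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical encoding: (state, symbol) -> next state.
Transition = Dict[Tuple[int, str], int]


def accepts(delta: Transition, start: int, accepting: Set[int], word: str) -> bool:
    """Run the DFA on `word` and report whether it ends in an accepting state."""
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state in accepting


def is_consistent(
    delta: Transition,
    start: int,
    accepting: Set[int],
    sample: Iterable[Tuple[str, bool]],
) -> bool:
    """A DFA is consistent with a sample if it accepts exactly the
    words labeled True and rejects exactly the words labeled False."""
    return all(accepts(delta, start, accepting, w) == label for w, label in sample)


# Illustrative two-state DFA over the alphabet {0, 1}:
# state 0 means "even number of 1s seen so far", state 1 means "odd".
PARITY_DELTA: Transition = {
    (0, "0"): 0, (0, "1"): 1,
    (1, "0"): 1, (1, "1"): 0,
}

# Illustrative sample: each word is labeled accept (True) or reject (False).
SAMPLE = [("", True), ("1", False), ("11", True), ("101", True), ("10", False)]

print(is_consistent(PARITY_DELTA, 0, {0}, SAMPLE))
```

For this particular sample, no one-state DFA is consistent: a single accepting state accepts "1", and a single rejecting state rejects the empty word, so here $\mathit{opt} = 2$. Checking consistency is easy; the hardness result concerns finding a consistent DFA whose size is close to $\mathit{opt}$.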
