Learning and Extracting Finite State Automata with Second-Order Recurrent Neural Networks

We show that a recurrent, second-order neural network using a real-time, forward training algorithm readily learns to infer small regular grammars from positive and negative string training samples. We present simulations that show the effect of initial conditions, training set size and order, and neural network architecture. All simulations were performed with random initial weight strengths and usually converged after approximately one hundred epochs of training. We discuss a quantization algorithm for dynamically extracting finite state automata during and after training. For a well-trained neural net, the extracted automata constitute an equivalence class of state machines that are reducible to the minimal machine of the inferred grammar. We then show through simulations that many of the neural net state machines are dynamically stable, that is, they correctly classify many long unseen strings. In addition, some of these extracted automata actually outperform the trained neural network for classification of unseen strings.
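The abstract does not spell out the network equations, so the following is a minimal sketch, assuming the usual second-order formulation S_i(t+1) = g(sum_{j,k} W_ijk * S_j(t) * I_k(t)) with a sigmoid g, together with a simple bin-quantization extraction of a state machine from the visited activations. The names (SecondOrderRNN, extract_fsa) and the bin count q are illustrative, not from the paper, and the real-time, forward training algorithm itself is omitted.

```python
# Sketch of a second-order RNN and quantization-based FSA extraction.
# Illustrative only; not the authors' implementation.

import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class SecondOrderRNN:
    """Second-order RNN: S_i(t+1) = g( sum_{j,k} W_ijk * S_j(t) * I_k(t) )."""

    def __init__(self, n_states, n_symbols, seed=0):
        rng = np.random.default_rng(seed)
        # Small random initial weight strengths, as in the simulations above.
        self.W = rng.uniform(-0.1, 0.1, size=(n_states, n_states, n_symbols))

    def step(self, state, symbol_onehot):
        # Each second-order weight multiplies a state-input product.
        return sigmoid(np.einsum('ijk,j,k->i', self.W, state, symbol_onehot))

    def run(self, string, n_symbols):
        state = np.zeros(self.W.shape[0])
        state[0] = 1.0                        # conventional start state
        for sym in string:                    # string: sequence of symbol ids
            state = self.step(state, np.eye(n_symbols)[sym])
        return state                          # state[0] acts as the accept unit


def extract_fsa(net, strings, n_symbols, q=2):
    """Quantize each state unit into q bins; record the induced transitions."""
    def quantize(state):
        # Map each activation in (0, 1) to one of q bins; the tuple of bins
        # is the discrete FSA state label.
        return tuple(np.minimum((state * q).astype(int), q - 1))

    transitions, accepting = {}, set()
    for s in strings:
        state = np.zeros(net.W.shape[0]); state[0] = 1.0
        code = quantize(state)
        for sym in s:
            nxt = net.step(state, np.eye(n_symbols)[sym])
            transitions[(code, sym)] = quantize(nxt)
            state, code = nxt, quantize(nxt)
        if state[0] > 0.5:                    # response unit above threshold
            accepting.add(code)
    return transitions, accepting
```

The transition table returned by extract_fsa can then be fed to a standard DFA minimization procedure (e.g., Moore's or Hopcroft's algorithm), which is consistent with the abstract's claim that the extracted machines form an equivalence class reducible to the minimal machine of the inferred grammar.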
