A Recurrent Neural Network that Learns to Count

Parallel distributed processing (PDP) architectures offer a potentially radical alternative to traditional theories of language processing, which are based on serial computational models. However, learning complex structural relationships in temporal data presents a serious challenge to PDP systems. For example, automata theory dictates that processing strings from a context-free language (CFL) requires a stack or counter memory device. While some PDP models have been hand-crafted to emulate such a device, it is not clear how a neural network might develop one when learning a CFL. This research employs standard backpropagation training techniques for a recurrent neural network (RNN) on the task of learning to predict the next character in a simple deterministic CFL (DCFL). We show that an RNN can learn to recognize the structure of a simple DCFL, and we use dynamical systems theory to identify how the network's states reflect that structure by building counters in phase space. The work is an empirical investigation that complements theoretical analyses of network capabilities, yet is original in the specific configuration of dynamics involved. The application of dynamical systems theory helps us relate the simulation results to theoretical results, and the learning task lets us highlight some issues for understanding dynamical systems that process language with counters.
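
As a concrete illustration of the kind of training setup described above, here is a minimal NumPy sketch of a simple recurrent network trained by backpropagation through time on next-symbol prediction over strings of the form a^n b^n. The alphabet, boundary marker '#', network sizes, and learning rate are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch (assumed setup, not the paper's exact one): a simple recurrent
# network trained by backpropagation through time to predict the next symbol in
# strings of the deterministic context-free language a^n b^n.
import numpy as np

rng = np.random.default_rng(0)
SYMS = ['a', 'b', '#']          # '#' marks string boundaries (an assumption)
IDX = {s: i for i, s in enumerate(SYMS)}
N_IN, N_HID, N_OUT = 3, 3, 3    # a small hidden layer suffices for counting

# Weights: input-to-hidden, recurrent hidden-to-hidden, hidden-to-output
W_xh = rng.normal(0, 0.5, (N_HID, N_IN))
W_hh = rng.normal(0, 0.5, (N_HID, N_HID))
W_hy = rng.normal(0, 0.5, (N_OUT, N_HID))
b_h = np.zeros(N_HID)
b_y = np.zeros(N_OUT)

def one_hot(i):
    v = np.zeros(N_IN); v[i] = 1.0
    return v

def make_string(n):
    """A training sequence '#' + a^n + b^n; the target at each step is the next symbol."""
    return ['#'] + ['a'] * n + ['b'] * n

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def train_on_string(seq, lr=0.1):
    """One forward/backward pass of backpropagation through time over a single string."""
    xs, hs, ps, ts = [], [np.zeros(N_HID)], [], []
    # Forward pass: at step t the network sees seq[t] and predicts seq[t+1]
    for t in range(len(seq) - 1):
        x = one_hot(IDX[seq[t]])
        h = np.tanh(W_xh @ x + W_hh @ hs[-1] + b_h)
        p = softmax(W_hy @ h + b_y)
        xs.append(x); hs.append(h); ps.append(p); ts.append(IDX[seq[t + 1]])
    # Backward pass, accumulating gradients through time
    dW_xh = np.zeros_like(W_xh); dW_hh = np.zeros_like(W_hh)
    dW_hy = np.zeros_like(W_hy); db_h = np.zeros_like(b_h); db_y = np.zeros_like(b_y)
    dh_next = np.zeros(N_HID)
    for t in reversed(range(len(xs))):
        dy = ps[t].copy(); dy[ts[t]] -= 1.0          # softmax + cross-entropy gradient
        dW_hy += np.outer(dy, hs[t + 1]); db_y += dy
        dh = W_hy.T @ dy + dh_next
        dz = (1.0 - hs[t + 1] ** 2) * dh             # back through the tanh nonlinearity
        dW_xh += np.outer(dz, xs[t]); dW_hh += np.outer(dz, hs[t]); db_h += dz
        dh_next = W_hh.T @ dz
    for W, dW in [(W_xh, dW_xh), (W_hh, dW_hh), (W_hy, dW_hy), (b_h, db_h), (b_y, db_y)]:
        W -= lr * dW                                 # in-place gradient step

# Train on short strings drawn at random (n between 1 and 5)
for epoch in range(5000):
    train_on_string(make_string(int(rng.integers(1, 6))))
```

After training such a network, the quantity of interest is not the weights but the hidden-state trajectory: plotting the hidden activations while the network reads a's and then b's should show the state advancing along one direction during the a's and retracing it during the b's, which is the "counter in phase space" that the dynamical systems analysis in the paper identifies.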
