Learning Context Sensitive Languages with LSTM Trained with Kalman Filters

Unlike traditional recurrent neural networks, the Long Short-Term Memory (LSTM) model generalizes well when presented with training sequences derived from regular and also simple nonregular languages. Our novel combination of LSTM and the decoupled extended Kalman filter, however, learns even faster and generalizes even better, requiring only the 10 shortest exemplars (n <= 10) of the context sensitive language a^n b^n c^n to deal correctly with values of n up to 1000 and more. Even when the relatively high update complexity per timestep is taken into account, the hybrid in many cases learns faster than LSTM by itself.
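To make the training scheme concrete, the following is a minimal sketch (not the authors' code) of one decoupled extended Kalman filter (DEKF) weight update, assuming the network's weights are partitioned into per-neuron groups and that the Jacobians of the network output with respect to each group are already available (e.g. from LSTM's truncated gradient computation). The function name, noise scalars, and shapes are illustrative assumptions.

```python
import numpy as np

def dekf_step(weights, P, jacobians, error, r=1.0, q=1e-4):
    """One decoupled EKF update (illustrative sketch, not the paper's code).

    weights   : list of weight vectors, one per group     (each shape (n_i,))
    P         : list of error-covariance matrices         (each shape (n_i, n_i))
    jacobians : list of Jacobians d(output)/d(w_i)         (each shape (n_i, n_out))
    error     : target minus network output                (shape (n_out,))
    r, q      : assumed measurement- and process-noise scalars
    """
    n_out = error.shape[0]
    # Global scaling matrix shared by all groups:
    # A = R + sum_i H_i^T P_i H_i
    A = r * np.eye(n_out)
    for P_i, H_i in zip(P, jacobians):
        A += H_i.T @ P_i @ H_i
    A_inv = np.linalg.inv(A)
    # Per-group Kalman gain, weight correction, and covariance update.
    for i, (w_i, P_i, H_i) in enumerate(zip(weights, P, jacobians)):
        K_i = P_i @ H_i @ A_inv                  # Kalman gain, shape (n_i, n_out)
        weights[i] = w_i + K_i @ error           # weight update driven by output error
        P[i] = P_i - K_i @ H_i.T @ P_i + q * np.eye(P_i.shape[0])
    return weights, P
```

Decoupling the full extended Kalman filter into independent weight groups is what keeps the per-timestep cost manageable: each covariance matrix P_i is only as large as its group, rather than covering all weights at once.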