Word recognition with recurrent network automata

The authors report a method for encoding temporal information directly into a neural network: the information is modeled explicitly with a left-to-right automaton, and a recurrent network is trained to identify the automaton states. State lengths and positions are adjusted with the usual iterative train-and-re-segment procedure. The global model is a hybrid of a recurrent neural network, which implements the state transition models, and dynamic programming, which finds the best state sequence. The advantages of using recurrent networks are illustrated on a speaker-independent digit recognition task.
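
The dynamic programming half of the hybrid can be illustrated with a minimal sketch. It assumes that some network has already produced a per-frame log-score for every automaton state, and that the automaton is strictly left-to-right (at each frame the path either stays in its state or advances by one). The function name `align_left_to_right` and the score layout are hypothetical, chosen only for illustration; the abstract does not specify the authors' actual formulation.

```python
from typing import List, Tuple

NEG_INF = float("-inf")


def align_left_to_right(scores: List[List[float]]) -> Tuple[float, List[int]]:
    """Find the best state sequence through a left-to-right automaton.

    scores[t][s] is an assumed log-score of state s at frame t
    (for example, a network output). The path must start in state 0,
    end in the last state, and may only stay or advance by one state.
    """
    n_frames = len(scores)
    n_states = len(scores[0])
    if n_frames < n_states:
        raise ValueError("need at least one frame per state")

    # best[t][s]: best cumulative score of any valid path ending in s at t
    best = [[NEG_INF] * n_states for _ in range(n_frames)]
    # back[t][s]: state occupied at frame t-1 on that best path
    back = [[0] * n_states for _ in range(n_frames)]
    best[0][0] = scores[0][0]

    for t in range(1, n_frames):
        for s in range(n_states):
            stay = best[t - 1][s]
            advance = best[t - 1][s - 1] if s > 0 else NEG_INF
            if advance > stay:
                best[t][s] = advance + scores[t][s]
                back[t][s] = s - 1
            else:
                best[t][s] = stay + scores[t][s]
                back[t][s] = s

    # Trace back from the final state to recover the segmentation.
    path = [n_states - 1]
    for t in range(n_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return best[-1][-1], path


if __name__ == "__main__":
    toy_scores = [
        [0.0, -5.0, -5.0],
        [0.0, -5.0, -5.0],
        [-5.0, 0.0, -5.0],
        [-5.0, -5.0, 0.0],
        [-5.0, -5.0, 0.0],
    ]
    total, path = align_left_to_right(toy_scores)
    print(total, path)
```

In a train-and-re-segment loop, the returned path would presumably supply the new state boundaries used as training targets in the next iteration. For recognition, one plausible use is to run the alignment once per word model and choose the word with the highest total score.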
