Paper Information - Multi-State Time Delay Neural Networks for Continuous Speech Recognition

Multi-State Time Delay Neural Networks for Continuous Speech Recognition

Alex Waibel, Carnegie Mellon University, Pittsburgh, PA 15213, ahw@cs.cmu.edu

We present the "Multi-State Time Delay Neural Network" (MS-TDNN) as an extension of the TDNN to robust word recognition. Unlike most other hybrid methods, the MS-TDNN embeds an alignment search procedure into the connectionist architecture and allows for word-level supervision. The resulting system can manage the sequential order of subword units while optimizing for recognizer performance. In this paper we present extensive new evaluations of this approach over speaker-dependent and speaker-independent connected alphabet.

P. Haffner | A. Waibel

[1] Hermann Ney, et al. The use of a one-stage dynamic programming algorithm for connected word recognition, 1984.

[2] Alexander H. Waibel, et al. Time-delay neural networks embedding time alignment: a performance analysis, 1991, EUROSPEECH.

[3] Alex Waibel, et al. Integrating time alignment and neural networks for high performance continuous speech recognition, 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.

[4] Raj Reddy, et al. Large-vocabulary speaker-independent continuous speech recognition: the Sphinx system, 1988.

[5] John S. Bridle, et al. Alpha-nets: A recurrent 'neural' network architecture with a hidden Markov model interpretation, 1990, Speech Commun.

[6] Geoffrey E. Hinton, et al. Phoneme recognition using time-delay neural networks, 1989, IEEE Trans. Acoust. Speech Signal Process.

[7] Yoshua Bengio, et al. Artificial neural networks and their application to sequence recognition, 1991.

[8] A. Waibel, et al. Connectionist Viterbi training: a new hybrid method for continuous speech recognition, 1990, International Conference on Acoustics, Speech, and Signal Processing.