Multi-State Time Delay Neural Networks for Continuous Speech Recognition

Alex Waibel
Carnegie Mellon University
Pittsburgh, PA 15213
ahw@cs.cmu.edu

We present the "Multi-State Time Delay Neural Network" (MS-TDNN) as an extension of the TDNN to robust word recognition. Unlike most other hybrid methods, the MS-TDNN embeds an alignment search procedure into the connectionist architecture and allows for word-level supervision. The resulting system can manage the sequential order of subword units while optimizing recognizer performance. In this paper we present extensive new evaluations of this approach on speaker-dependent and speaker-independent connected alphabet recognition.