Multiple-State Context-Dependent Phonetic Modeling with MLPs

Earlier hybrid multilayer perceptron (MLP)/hidden Markov model (HMM) continuous speech recognition systems have not modeled context-dependent phonetic effects, sequences of distributions for phonetic models, or gender-based speech consistencies. In this paper we present a new MLP architecture and training procedure for modeling context-dependent phonetic classes with a sequence of distributions. A new training procedure that "smooths" networks with different degrees of context-dependence is proposed in order to obtain a robust estimate of the context-dependent probabilities. We have used this new architecture to model generalized biphone phonetic contexts. Tests with the speaker-independent DARPA Resource Management database have shown average reductions in word error rates of 20% in both the word-pair grammar and no-grammar cases, compared with our earlier context-independent MLP/HMM hybrid.
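The abstract does not spell out how the networks are smoothed, so the following is only an illustration of the general idea behind smoothing estimates of differing context-dependence: a sparse context-dependent estimate is pulled toward a more robust context-independent one. The sketch uses plain linear interpolation, which is a stand-in technique and not the training procedure proposed in the paper. The function name, the example posteriors, and the weight `lam` are all hypothetical.

```python
def interpolate_posteriors(p_ci, p_cd, lam):
    """Blend context-independent and context-dependent class posteriors.

    p_ci : list of floats, posteriors from a context-independent estimator
    p_cd : list of floats, posteriors from a context-dependent estimator
    lam  : weight in [0, 1] given to the context-dependent estimate

    NOTE: linear interpolation is an assumption made for illustration;
    it is not the smoothing procedure described in the paper.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(p_ci) != len(p_cd):
        raise ValueError("posterior vectors must have the same length")

    # Weighted sum of the two estimates, class by class.
    mixed = [lam * cd + (1.0 - lam) * ci for ci, cd in zip(p_ci, p_cd)]

    # Renormalize so the result is a proper distribution even if the
    # inputs were only approximately normalized.
    total = sum(mixed)
    if total <= 0.0:
        raise ValueError("posteriors must have positive total mass")
    return [m / total for m in mixed]


# Hypothetical posteriors over three phone classes for one frame.
p_ci = [0.6, 0.3, 0.1]
p_cd = [0.2, 0.7, 0.1]
smoothed = interpolate_posteriors(p_ci, p_cd, lam=0.5)
```

With `lam` near 0 the result falls back on the context-independent estimate, which is the safer choice when a context has few training examples; with `lam` near 1 it trusts the context-dependent estimate.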
