Paper information - Context-Dependent Multiple Distribution Phonetic Modeling with MLPs

A number of hybrid multilayer perceptron (MLP)/hidden Markov model (HMM) speech recognition systems have been developed in recent years (Morgan and Bourlard, 1990). In this paper, we present a new MLP architecture and training algorithm that allow context-dependent phonetic classes to be modeled in a hybrid MLP/HMM framework. The new training procedure smooths MLPs trained at different degrees of context dependence to obtain a robust estimate of the context-dependent probabilities. Tests on the DARPA Resource Management database show substantial advantages of the context-dependent MLPs over earlier context-independent MLPs, and of the hybrid approach over a pure HMM approach.
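The abstract does not give the smoothing formula, so the following is only a minimal sketch of the general idea, assuming a simple linear interpolation between a context-independent and a context-dependent posterior estimate over the same phone classes. The function names, the single weight `lam`, and the example numbers are hypothetical and are not taken from the paper. The second function shows the conversion commonly used in hybrid MLP/HMM systems, where network posteriors are divided by class priors to give scaled likelihoods for the HMM.

```python
import numpy as np


def interpolate_posteriors(p_ci, p_cd, lam):
    """Linearly interpolate two posterior estimates over the same classes.

    p_ci: posteriors from a context-independent model (hypothetical input)
    p_cd: posteriors from a context-dependent model (hypothetical input)
    lam:  weight on the context-dependent estimate, in [0, 1] (assumed
          to be a single scalar here; the paper may weight differently)
    """
    p_ci = np.asarray(p_ci, dtype=float)
    p_cd = np.asarray(p_cd, dtype=float)
    if p_ci.shape != p_cd.shape:
        raise ValueError("both estimates must cover the same classes")
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    mixed = lam * p_cd + (1.0 - lam) * p_ci
    # Renormalize to guard against rounding drift in the inputs.
    return mixed / mixed.sum()


def scaled_likelihoods(posteriors, priors):
    """Divide posteriors by class priors to obtain scaled likelihoods."""
    posteriors = np.asarray(posteriors, dtype=float)
    priors = np.asarray(priors, dtype=float)
    return posteriors / priors


if __name__ == "__main__":
    # Illustrative values only; not results from the paper.
    smoothed = interpolate_posteriors([0.6, 0.3, 0.1], [0.2, 0.7, 0.1], lam=0.25)
    print(smoothed)
    print(scaled_likelihoods(smoothed, [0.5, 0.25, 0.25]))
```

A small `lam` leans on the more robust context-independent estimate, which is the usual choice when a context has little training data; a larger `lam` trusts the context-specific estimate more.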

[1] Frederick Jelinek, et al. Interpolated estimation of Markov source parameters from sparse data, 1980.

[2] L. R. Rabiner, et al. An introduction to the application of the theory of probabilistic functions of a Markov process to automatic speech recognition, 1983, The Bell System Technical Journal.

[3] John Makhoul, et al. Context-dependent modeling for acoustic-phonetic recognition of continuous speech, 1985, ICASSP '85: IEEE International Conference on Acoustics, Speech, and Signal Processing.

[4] Geoffrey E. Hinton, et al. Learning internal representations by error propagation, 1986.

[5] Mitch Weintraub, et al. The DECIPHER speech recognition system, 1990, International Conference on Acoustics, Speech, and Signal Processing.

[6] Hervé Bourlard, et al. Continuous speech recognition using multilayer perceptrons with hidden Markov models, 1990, International Conference on Acoustics, Speech, and Signal Processing.

[7] Steve Renals, et al. Connectionist probability estimation in the DECIPHER speech recognition system, 1992, ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[8] Jeff A. Bilmes, et al. The Ring Array Processor: A Multiprocessing Peripheral for Connectionist Applications, 1992, J. Parallel Distributed Comput.

[9] N. Morgan, et al. Probability estimation in the DECIPHER speech recognition system, 1992.

[10] Hervé Bourlard, et al. CDNN: a context dependent neural network for continuous speech recognition, 1992, ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.