Paper information - Speech recognition with continuous-parameter hidden Markov models

Speech recognition with continuous-parameter hidden Markov models

Abstract: The acoustic modeling problem in automatic speech recognition is examined from an information-theoretic point of view. This problem is to design a speech-recognition system which can extract from the speech waveform as much information as possible about the corresponding word sequence. The information extraction process is broken down into two steps: a signal-processing step which converts a speech waveform into a sequence of information-bearing acoustic feature vectors, and a step which models such a sequence. We are primarily concerned with the use of hidden Markov models to model sequences of feature vectors which lie in a continuous space. We explore the trade-off between packing information into such sequences and being able to model them accurately. The difficulty of developing accurate models of continuous-parameter sequences is addressed by investigating a method of parameter estimation which is designed to cope with inaccurate modeling assumptions.

[1] H. P. Friedman, et al. On Some Invariant Criteria for Grouping Data, 1967.

[2] Biing-Hwang Juang, et al. Mixture autoregressive hidden Markov models for speaker independent isolated word recognition, 1986, ICASSP '86. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[3] A. Nadas, et al. A decision theoretic formulation of a training problem in speech recognition and a comparison of training by unconditional versus conditional maximum likelihood, 1983.

[4] Peter F. Brown, et al. The acoustic-modeling problem in automatic speech recognition, 1987.

[5] Geoffrey E. Hinton, et al. Learning and relearning in Boltzmann machines, 1986.

[6] L. Rabiner, et al. The acoustics, speech, and signal processing society - A historical perspective, 1984, IEEE ASSP Magazine.

[7] A. Poritz, et al. On hidden Markov models in isolated word recognition, 1986, ICASSP '86. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[8] J. Makhoul, et al. Vector quantization in speech coding, 1985, Proceedings of the IEEE.

[9] Lalit R. Bahl, et al. A Maximum Likelihood Approach to Continuous Speech Recognition, 1983, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10] L. R. Rabiner, et al. Recognition of isolated digits using hidden Markov models with continuous mixture densities, 1985, AT&T Technical Journal.

[11] Frederick Jelinek, et al. The development of an experimental discrete dictation recognizer, 1985.

[12] L. Baum, et al. An inequality and associated maximization technique in statistical estimation of probabilistic functions of a Markov process, 1972.

[13] R. Gray, et al. Vector quantization, 1984, IEEE ASSP Magazine.

[14] Lalit R. Bahl, et al. Continuous speech recognition with automatically selected acoustic prototypes obtained by either bootstrapping or clustering, 1981, ICASSP.

[15] Lalit R. Bahl, et al. Continuous parameter acoustic processing for recognition of a natural speech corpus, 1981, ICASSP.

[16] Lalit R. Bahl, et al. Maximum mutual information estimation of hidden Markov model parameters for speech recognition, 1986, ICASSP '86. IEEE International Conference on Acoustics, Speech, and Signal Processing.