A hybrid MaxEnt/HMM based ASR system

The aim of this work is to develop a practical framework that extends classical Hidden Markov Models (HMMs) for continuous speech recognition using the Maximum Entropy (MaxEnt) principle. Like hybrid NN/HMM connectionist speech recognition systems, MaxEnt models estimate posterior probabilities directly. In particular, a new acoustic model based on discriminative MaxEnt models is formulated and is being developed to replace the generative Gaussian Mixture Models (GMMs) commonly used to model acoustic variability. Initial experimental results on the TIMIT phone task are reported.
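As a rough illustration of what estimating posteriors directly can look like, the sketch below computes state posteriors with a log-linear (MaxEnt) model and converts them to scaled likelihoods by dividing by state priors, the convention commonly used in hybrid NN/HMM systems. The function names, the feature vector, the weights, and the priors are all hypothetical, and the abstract does not say that this work uses the same prior-division step; this is a minimal sketch under those assumptions, not the system described here.

```python
import math


def maxent_posteriors(features, weights, biases):
    """Log-linear posterior P(state | x) = exp(w_s . f(x) + b_s) / Z(x).

    features: list of feature values f(x) for one acoustic frame (assumed).
    weights:  one weight vector per HMM state (hypothetical parameters).
    biases:   one bias term per HMM state (hypothetical parameters).
    """
    scores = [
        sum(w * f for w, f in zip(w_s, features)) + b_s
        for w_s, b_s in zip(weights, biases)
    ]
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def scaled_likelihoods(posteriors, priors):
    """Divide posteriors by state priors, giving p(x | state) / p(x).

    These scaled likelihoods can stand in for GMM emission scores during
    HMM decoding, as is done in hybrid NN/HMM systems.
    """
    return [p / q for p, q in zip(posteriors, priors)]


# Toy usage: two states, two features, made-up parameters.
frame = [1.0, 0.0]
post = maxent_posteriors(frame, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
emis = scaled_likelihoods(post, [0.5, 0.5])
```

The scaled likelihoods differ from true likelihoods only by the factor p(x), which is the same for every state at a given frame and so does not change the best path found by Viterbi decoding.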
