Hidden Markov models, maximum mutual information estimation, and the speech recognition problem

Hidden Markov Models (HMMs) are one of the most powerful speech recognition tools available today. Even so, the inadequacies of HMMs as a "correct" modeling framework for speech are well known. In that context, we argue that the maximum mutual information estimation (MMIE) formulation for training is better suited than maximum likelihood estimation (MLE) to reducing the error rate. We also show how MMIE paves the way for new training possibilities. We introduce corrective MMIE training, a very efficient new training algorithm that uses a modified version of a discrete reestimation formula recently proposed by Gopalakrishnan et al. We propose reestimation formulas for the case of diagonal Gaussian densities, experimentally demonstrate their convergence properties, and integrate them into our training algorithm. On a connected-digit recognition task, MMIE consistently improves the recognition performance of our recognizer.
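For context, the contrast the abstract draws can be made concrete with the standard objectives from the discriminative-training literature. The following sketch uses common notation rather than the paper's own symbols: O_r is the r-th training observation sequence, w_r its correct transcription, lambda the HMM parameters, and C a sufficiently large damping constant.

% MLE: maximize the likelihood of the data given the correct transcriptions.
F_{\mathrm{MLE}}(\lambda) = \sum_{r=1}^{R} \log P_\lambda(O_r \mid w_r)

% MMIE: maximize the posterior of the correct transcription, so competing
% hypotheses in the denominator are explicitly pushed down.
F_{\mathrm{MMIE}}(\lambda) = \sum_{r=1}^{R} \log
  \frac{P_\lambda(O_r \mid w_r)\, P(w_r)}{\sum_{w} P_\lambda(O_r \mid w)\, P(w)}

% Growth-transform reestimation of Gopalakrishnan et al. for a discrete
% parameter row \{\theta_{ij}\}_j, with C large enough to guarantee growth of F:
\hat{\theta}_{ij} = \frac{\theta_{ij}\,\bigl(\partial F / \partial \theta_{ij} + C\bigr)}
  {\sum_{j'} \theta_{ij'}\,\bigl(\partial F / \partial \theta_{ij'} + C\bigr)}

The paper's contribution is a modified version of this discrete update and its extension to diagonal Gaussian means and variances, whose exact form is not reproduced here.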
