Feature Transformation and Model Design Using Minimum Classification Error

We have implemented a Minimum Classification Error (MCE) based recognition system that also estimates a global feature transformation matrix. Unlike earlier studies, we explicitly assume that the covariance matrices of the Gaussian mixture components are diagonal when estimating the transformation matrix. This assumption is necessary for mathematical consistency between the model estimates and the transformation matrix estimate. Experimental results show a reduction of up to 50% in word error rate compared with Maximum Likelihood estimation.
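To make the setup concrete, the following is a minimal sketch of an MCE loss for a single feature vector passed through a global linear transform and scored with diagonal-covariance Gaussians. It uses the standard textbook form of the MCE misclassification measure with a sigmoid loss; it is not the system described above. The function names, the one-Gaussian-per-class simplification, and the smoothing parameters `eta` and `gamma` are all assumptions made for illustration.

```python
import numpy as np


def diag_gauss_loglik(y, means, variances):
    """Log-likelihood of y under each class's diagonal-covariance Gaussian.

    means, variances: arrays of shape (n_classes, dim); one Gaussian per
    class is an illustrative simplification of a mixture model.
    """
    diff = y - means
    return -0.5 * np.sum(np.log(2.0 * np.pi * variances) + diff**2 / variances, axis=1)


def mce_loss(x, label, A, means, variances, eta=1.0, gamma=1.0):
    """Smoothed MCE loss for one sample (hypothetical helper).

    A is the global feature transformation matrix applied before scoring.
    eta controls the soft maximum over competing classes; gamma is the
    sigmoid slope. Both are assumed hyperparameters.
    """
    y = A @ x                                   # transformed feature vector
    g = diag_gauss_loglik(y, means, variances)  # per-class discriminants
    competitors = eta * np.delete(g, label)     # scores of all wrong classes
    m = np.max(competitors)                     # shift for numerical stability
    anti = (m + np.log(np.mean(np.exp(competitors - m)))) / eta
    d = -g[label] + anti                        # misclassification measure
    return 1.0 / (1.0 + np.exp(-gamma * d))     # sigmoid: near 0 if correct


# Two classes in two dimensions, identity transform.
means = np.array([[0.0, 0.0], [3.0, 3.0]])
variances = np.ones((2, 2))
A = np.eye(2)
x = np.array([0.0, 0.0])

loss_if_label_0 = mce_loss(x, 0, A, means, variances)
loss_if_label_1 = mce_loss(x, 1, A, means, variances)
```

Here `x` sits on the mean of class 0, so the loss is close to 0 when the true label is 0 and close to 1 when the true label is 1. Because the loss is differentiable in `A`, `means`, and `variances`, all three can in principle be updated jointly by gradient descent, which is where a consistent diagonal-covariance assumption matters.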
