A new algorithm for fast discriminative training

Currently, almost all discriminative training algorithms for nonlinear classifier design are based on gradient-descent methods, such as backpropagation and the generalized probabilistic descent (GPD) algorithm. These algorithms are easy to derive and effective in applications. However, a drawback of gradient-descent approaches is their slow training speed, which limits their use in large training problems such as large-vocabulary speech recognition. For hidden Markov models, some training algorithms, such as the reestimation (or expectation-maximization) algorithm for maximum-likelihood estimation (MLE), are fast, but they do not extend readily to discriminative training for improving recognition performance. To address this problem, we propose a fast discriminative training algorithm in this paper. It is a batch-mode algorithm derived for a minimum-error-rate objective function. Its significant advantage is a closed-form solution for parameter estimation at each iteration, rather than an incremental search along the gradient direction, as is conventionally done. We show experimentally that the algorithm requires only a few iterations to achieve its optimization objective and that the estimated parameters yield better recognition performance than traditional MLE.

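To make the contrast concrete, below is a minimal sketch of the kind of per-token gradient-descent (GPD-style) update that the proposed batch-mode, closed-form algorithm replaces. It minimizes a smoothed error count by taking a small step against its gradient for each training sample. The function names, the sigmoid smoothing parameter `alpha`, and the learning rate `lr` are illustrative assumptions, not the paper's notation or its algorithm.

```python
import numpy as np

def smoothed_error(d, alpha=1.0):
    """Smooth 0-1 loss: a sigmoid of the misclassification measure d,
    which is positive when the token is misclassified."""
    return 1.0 / (1.0 + np.exp(-alpha * d))

def gpd_step(theta, d, grad_d, lr=0.01, alpha=1.0):
    """One GPD-style update for a single training token.

    theta  : current parameter vector
    d      : misclassification measure for this token
    grad_d : gradient of d with respect to theta
    """
    ell = smoothed_error(d, alpha)
    # Chain rule: d(loss)/d(theta) = ell * (1 - ell) * alpha * grad_d
    grad_loss = ell * (1.0 - ell) * alpha * grad_d
    # Incremental search in the negative gradient direction;
    # many such small steps are needed, hence the slow training speed.
    return theta - lr * grad_loss
```

Because each step moves the parameters only a small distance, many passes over the training data are typically needed; the algorithm proposed here instead computes a closed-form batch reestimate per iteration, in the spirit of EM-style reestimation.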