Hidden Markov models with first-order equalization for noisy speech recognition

A particularly effective distortion measure that takes into account the norm shrinkage bias in the noisy cepstrum is considered. A first-order equalization mechanism, specifically aiming at avoiding the norm shrinkage problem, is incorporated in a hidden Markov model (HMM) framework to model the speech cepstral sequence. Such a modeling technique requires special care, as the formulation inevitably involves parameter estimation from a set of data with singular dispersion. Solutions to this HMM stochastic modeling problem are provided, and algorithms for estimating the necessary model parameters are given. It is experimentally shown that incorporation of the first-order mean equalization model makes the HMM-based speech recognizer robust to noise. With respect to a conventional HMM recognizer, this leads to an improvement in recognition performance which is equivalent to a gain of about 15-20 dB in signal-to-noise ratio.
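The abstract gives no formulas, so the following is only an illustrative sketch of why norm shrinkage hurts a plain distance and how a scalar equalization can compensate. It assumes, purely for illustration, that noise shrinks a cepstral vector by a single scalar factor and that the equalization is a scalar gain on the reference mean, fitted by least squares. The function names, the example vectors, and the shrink factor are all hypothetical, and this is not the parameter estimation algorithm of the paper.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def equalization_gain(observed, reference):
    """Scalar g minimizing ||observed - g * reference||^2 (least squares)."""
    denom = dot(reference, reference)
    if denom == 0.0:
        raise ValueError("reference vector must be nonzero")
    return dot(observed, reference) / denom


def euclidean_distortion(observed, reference):
    """Plain squared Euclidean distance, with no equalization."""
    return sum((x - y) ** 2 for x, y in zip(observed, reference))


def equalized_distortion(observed, reference):
    """Squared distance after scaling the reference by the fitted gain."""
    g = equalization_gain(observed, reference)
    return sum((x - g * y) ** 2 for x, y in zip(observed, reference))


# Hypothetical clean cepstral mean (assumption, not data from the paper).
clean_mean = [1.2, -0.8, 0.5, 0.3]

# Assumed noise model: the noisy cepstrum is a shrunk copy of the clean one.
shrink = 0.4
noisy = [shrink * c for c in clean_mean]

gain = equalization_gain(noisy, clean_mean)
plain = euclidean_distortion(noisy, clean_mean)
equalized = equalized_distortion(noisy, clean_mean)

print(f"fitted gain: {gain:.3f}")
print(f"plain distortion: {plain:.4f}")
print(f"equalized distortion: {equalized:.4f}")
```

Under this assumed shrinkage model, the plain distance penalizes the noisy vector even though its direction matches the reference, while the equalized distance absorbs the norm change into the fitted gain.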
