Speech recognizers often suffer serious performance degradation when deployed in an unknown acoustic environment, particularly a noise-contaminated one. To combat this problem, the authors proposed in a previous study a distortion measure that takes into account the norm shrinkage bias in the noisy cepstrum. Here they incorporate a first-order equalization mechanism, specifically aimed at avoiding the norm shrinkage problem, into a hidden Markov model (HMM) framework for modeling the speech cepstral sequence. Such a modeling technique requires special care because the formulation inevitably involves parameter estimation from a set of data with singular dispersion. The authors provide solutions to this HMM stochastic modeling problem and give algorithms for estimating the necessary model parameters. They show experimentally that incorporating the first-order normal equalization model makes the HMM-based speech recognizer robust to noise. Relative to a conventional HMM recognizer, this yields an improvement in recognition performance equivalent to a gain of about 15-20 dB in signal-to-noise ratio.
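The idea of compensating for norm shrinkage can be illustrated with a minimal sketch. It assumes, purely for illustration, that equalization amounts to a single scale factor applied to a reference cepstral vector before measuring distance; the function name `equalized_distortion`, the example vectors, and the shrinkage factor 0.5 are all hypothetical, and the abstract does not give the authors' actual HMM formulation.

```python
import numpy as np


def equalized_distortion(observed, reference):
    """Squared distance after rescaling `reference` by the best scalar.

    Assumed form (not taken from the paper): min over lam of
    ||observed - lam * reference||^2, solved by least squares as
    lam = <observed, reference> / <reference, reference>.
    """
    observed = np.asarray(observed, dtype=float)
    reference = np.asarray(reference, dtype=float)
    lam = float(observed @ reference) / float(reference @ reference)
    residual = observed - lam * reference
    return float(residual @ residual), lam


# Hypothetical clean cepstral vector and a copy whose norm has shrunk
# while its direction is unchanged (an idealized noise effect).
clean = np.array([1.2, -0.7, 0.4, 0.1])
noisy = 0.5 * clean

# Plain squared Euclidean distance penalizes the shrinkage.
plain = float(np.sum((noisy - clean) ** 2))

# The equalized distance absorbs the shrinkage into the scale factor.
equalized, lam = equalized_distortion(noisy, clean)
```

In this idealized case the plain distance is nonzero even though the two vectors point in the same direction, while the equalized distance vanishes and the recovered scale factor equals the shrinkage applied. Real noisy cepstra also change direction, so the equalized distance would not generally be zero.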