Speech enhancement using minimum mean-square error estimation and a post-filter derived from vector quantization of clean speech

This paper proposes a novel post-filtering method applied after the logSTSA filter. Because the post-filter is derived from vector quantization of a clean speech database, its effect is equivalent to imposing clean-source spectral constraints on the enhanced speech. When combined with the logSTSA filter, the additional filter noticeably suppresses residual artifacts: it lowers the residual white noise left by decision-directed estimation and reduces the musical noise of maximum likelihood estimation. Relative to speech enhanced by the logSTSA filter alone, the combined system raises the PESQ score by nearly half a point.
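The abstract does not give the post-filter's exact form, so the following is only a minimal sketch of one way a clean-speech codebook could constrain an enhanced spectrum. Everything in it is an assumption made for illustration: the function names `nearest_codeword` and `vq_post_filter`, the log-magnitude codebook representation, the squared Euclidean distance, and the rule that caps each bin at the matched codeword's magnitude.

```python
import math


def nearest_codeword(frame_log, codebook):
    """Return the codebook entry closest to frame_log (squared Euclidean distance)."""
    return min(
        codebook,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(frame_log, c)),
    )


def vq_post_filter(enhanced_mag, codebook, floor=1e-10):
    """Attenuate bins of an enhanced magnitude spectrum that exceed the
    nearest clean-speech codeword (hypothetical rule, not the paper's).

    enhanced_mag: magnitudes of one frame after the first-stage filter.
    codebook:     list of clean-speech log-magnitude vectors (assumed to be
                  trained offline by vector quantization).
    """
    # Compare in the log-magnitude domain; floor avoids log(0).
    frame_log = [math.log(max(m, floor)) for m in enhanced_mag]
    target_log = nearest_codeword(frame_log, codebook)

    filtered = []
    for m, t in zip(enhanced_mag, target_log):
        # Gain never exceeds 1, so the post-filter only attenuates.
        gain = min(1.0, math.exp(t) / max(m, floor))
        filtered.append(gain * m)
    return filtered


# Toy two-bin example with a two-entry codebook.
codebook = [
    [math.log(1.0), math.log(1.0)],
    [math.log(4.0), math.log(0.5)],
]
result = vq_post_filter([4.0, 1.0], codebook)
```

In this toy case the frame matches the second codeword, so the first bin passes essentially unchanged and the second is halved. Capping the gain at 1 is a design choice of the sketch: it keeps the second stage from amplifying anything the first-stage filter already suppressed.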
