Robust blind dereverberation of speech signals based on characteristics of short-time speech segments

This paper addresses blind dereverberation techniques based on the inherent characteristics of speech signals. Two challenging issues in speech dereverberation are decomposing the observed reverberant signals into colored sources and room transfer functions (RTFs), and making the inverse filtering robust to acoustic and system noise. We show that short-time speech characteristics are very important for this task, and that multi-channel linear prediction (MCLP) is a useful tool for achieving robust inverse filtering. As examples, we detail three recently proposed robust dereverberation methods. By assuming the source to be a low-order autoregressive process, we present an efficient source estimation method that reduces the late reflections of the reverberation using multi-step linear prediction. By exploiting the time-varying nature of speech signals, we also develop a method that estimates both the source and the inverse filters of the RTFs. Furthermore, we achieve high-quality speech dereverberation by formulating the problem as likelihood maximization using a statistical speech model that represents the spectral characteristics of short-time speech segments, including harmonicity.
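To make the idea of multi-step linear prediction concrete, the following is a minimal single-channel sketch, not the multi-channel method of the paper. It predicts each sample from samples at least `delay` steps in the past, so that only the late part of the signal is modeled. The function name `delayed_linear_prediction`, its parameters, and the toy signal (white noise with one echo in place of speech) are all assumptions made for illustration.

```python
import numpy as np

def delayed_linear_prediction(x, order, delay):
    """Least-squares prediction of x[n] from x[n-delay], ..., x[n-delay-order+1].

    Hypothetical helper: returns (coeffs, predicted), where `predicted` is
    zero wherever a full window of past samples is not yet available.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    start = delay + order - 1  # first index with a full regressor window
    # Column k holds x[t-delay-k] for every target time t in [start, n).
    rows = np.stack(
        [x[start - delay - k : n - delay - k] for k in range(order)], axis=1
    )
    target = x[start:]
    coeffs, *_ = np.linalg.lstsq(rows, target, rcond=None)
    predicted = np.zeros(n)
    predicted[start:] = rows @ coeffs
    return coeffs, predicted

# Toy example (assumption): white-noise source plus a single echo at lag 5.
rng = np.random.default_rng(0)
n_samples = 20000
source = rng.standard_normal(n_samples)
impulse_response = np.zeros(6)
impulse_response[0] = 1.0   # direct path
impulse_response[5] = 0.5   # one late reflection
observed = np.convolve(source, impulse_response)[:n_samples]

coeffs, predicted = delayed_linear_prediction(observed, order=30, delay=5)
residual = observed - predicted  # observation with the predicted late part removed
```

In this toy setting the prediction captures the echo, and the residual is closer to the source than the observation is. With real speech the source itself is correlated over short lags, which is why the delay is chosen so that the predictor does not remove the short-time structure of the speech.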
