Max-Gabor analysis and synthesis of spectrograms

We present a method that decomposes a two-dimensional magnitude spectrogram S(f, t) into its local constituent spectro-temporal amplitudes A(f, t), frequencies F(f, t), orientations Θ(f, t), and phases φ(f, t). The method performs a two-dimensional local Gabor-like analysis of the spectrogram and, within each local region, retains only the parameters of the 2D Gabor filter with the maximal amplitude response. We demonstrate the technique on a wide variety of speakers, and show that in each case the spectrogram may be adequately reconstructed from the parameters of the Max-Gabor analysis. Finally, we discuss the nature of the extracted Max-Gabor parameters.
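The selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the isotropic Gaussian envelope, the discrete grid of candidate frequencies and orientations, the mean removal, and the names `gabor_kernel`, `max_gabor_patch`, `freqs`, `thetas`, and `sigma` are all assumptions made for illustration. The patch is assumed to be square with an odd side length, and the phase is measured relative to the patch center.

```python
import numpy as np


def gabor_kernel(size, freq, theta, sigma):
    """Complex 2D Gabor: Gaussian envelope times a complex plane wave.

    size is assumed odd; freq is in cycles per pixel (hypothetical units).
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Coordinate along the direction in which the wave varies.
    u = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return envelope * np.exp(2j * np.pi * freq * u)


def max_gabor_patch(patch, freqs, thetas, sigma):
    """Return (amplitude, frequency, orientation, phase) of the filter
    with the largest response magnitude on one square patch."""
    # Assumption: remove the local mean so the DC term does not dominate.
    patch = patch - patch.mean()
    best = (0.0, 0.0, 0.0, 0.0)
    for f in freqs:
        for th in thetas:
            k = gabor_kernel(patch.shape[0], f, th, sigma)
            r = np.sum(patch * np.conj(k))  # complex filter response
            if abs(r) > best[0]:
                best = (abs(r), f, th, np.angle(r))
    return best
```

Applying `max_gabor_patch` to a patch centered on each point of the spectrogram would give one value of A, F, Θ, and φ per point. Orientations are best restricted to the half-open range [0, π), because for a real-valued patch the filters at θ and θ + π give complex-conjugate responses with equal magnitude.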
