Paper Information - Monaural Speech Separation by Support Vector Machines: Bridging the Divide Between Supervised and Unsupervised Learning Methods


We address the problem of identifying multiple independent speech sources from a single signal that is a mixture of those sources. Because the problem is ill-posed, standard independent component analysis (ICA) approaches, which try to invert the mixing matrix, fail. We show how the unsupervised problem can be transformed into a supervised regression task, which is then solved by support-vector regression (SVR). It turns out that the linear SVR approach is equivalent to the sparse-decomposition method proposed by (1, 2). However, we can extend the method to nonlinear ICA by applying the "kernel trick." Beyond the kernel trick, the SVM perspective provides a new interpretation of the sparse-decomposition method's hyperparameter, which is related to the input noise. The limitation of the SVM perspective is that, in the nonlinear case, it can recover only whether or not a mixture component is present; it cannot recover the strength of the component. In experiments, we show that our model can handle difficult problems and is especially well suited to speech signal separation.
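The stated equivalence between linear SVR and sparse decomposition can be illustrated with a small sketch. It is not the authors' implementation. It assumes the standard correspondence in which the dictionary Gram matrix plays the role of the SVR kernel matrix, the inner products between dictionary atoms and the mixture play the role of the targets, and the epsilon-insensitivity parameter acts as the sparsity (noise) threshold; the bias term and the box constraint of SVR are omitted. The function names, the toy dictionary, and the coordinate-descent solver are illustrative choices.

```python
def dot(u, v):
    """Linear kernel: ordinary inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def soft_threshold(r, eps):
    """Shrink r toward zero by eps; values inside [-eps, eps] become exactly 0."""
    if r > eps:
        return r - eps
    if r < -eps:
        return r + eps
    return 0.0


def sparse_decompose(atoms, mixture, eps, kernel=dot, sweeps=200):
    """Minimize 0.5 * c^T K c - y^T c + eps * ||c||_1 by coordinate descent.

    K[i][j] = kernel(atom_i, atom_j) and y[i] = kernel(atom_i, mixture).
    With the linear kernel this equals 0.5 * ||mixture - sum_i c_i atom_i||^2
    + eps * ||c||_1 up to a constant, i.e. a sparse decomposition.
    Assumes K[i][i] > 0 for every atom.
    """
    n = len(atoms)
    K = [[kernel(atoms[i], atoms[j]) for j in range(n)] for i in range(n)]
    y = [kernel(atoms[i], mixture) for i in range(n)]
    c = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            # Residual target for coefficient i with all other coefficients fixed.
            r = y[i] - sum(K[i][j] * c[j] for j in range(n) if j != i)
            c[i] = soft_threshold(r, eps) / K[i][i]
    return c


# Toy dictionary of three unit-norm "source" waveforms (hypothetical data).
atoms = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.6, 0.0, 0.8, 0.0],
]
# Mixture = 2 * atoms[0] - 1 * atoms[2]; atoms[1] is absent.
mixture = [1.4, 0.0, -0.8, 0.0]

coeffs = sparse_decompose(atoms, mixture, eps=0.05)
print(coeffs)
```

In this toy case the absent atom receives a coefficient of exactly zero, while the two present atoms are recovered with a small shrinkage that grows with `eps`. Passing a different `kernel` function is where the kernel trick would enter the sketch; the coefficients then live in the kernel's feature space rather than in the signal space.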

Michael C. Mozer | Sepp Hochreiter

[1] Corinna Cortes, et al. Support-Vector Networks, 1995, Machine Learning.

[2] Michael Scholz, et al. Nonlinear Filtering of Electron Micrographs by Means of Support Vector Regression, 2003, NIPS.

[3] Barak A. Pearlmutter, et al. Blind Source Separation by Sparse Decomposition in a Signal Dictionary, 2001, Neural Computation.

[4] Gert Cauwenberghs, et al. Monaural separation of independent acoustical components, 1999, ISCAS'99. Proceedings of the 1999 IEEE International Symposium on Circuits and Systems VLSI (Cat. No.99CH36349).

[5] Toshiyuki Tanaka. Analysis of Bit Error Probability of Direct-Sequence CDMA Multiuser Demodulators, 2000, NIPS.

[6] Aapo Hyvärinen, et al. Survey on Independent Component Analysis, 1999.

[7] Barak A. Pearlmutter, et al. Monaural Source Separation Using Spectral Cues, 2004, ICA.

[8] J. Cardoso, et al. Blind beamforming for non-gaussian signals, 1993.

[9] Terrence J. Sejnowski, et al. Learning Nonlinear Overcomplete Representations for Efficient Coding, 1997, NIPS.

[10] Hagai Attias, et al. Blind Source Separation and Deconvolution: The Dynamic Component Analysis Algorithm, 1998, Neural Computation.

[11] R. C. Williamson, et al. Support vector regression with automatic accuracy control, 1998.

[12] Terrence J. Sejnowski, et al. Blind source separation of more sources than mixtures using overcomplete representations, 1999, IEEE Signal Processing Letters.

[13] Barak A. Pearlmutter, et al. Maximum Likelihood Blind Source Separation: A Context-Sensitive Generalization of ICA, 1996, NIPS.

[14] Sam T. Roweis, et al. One Microphone Source Separation, 2000, NIPS.

[15] Andrzej Cichocki, et al. A New Learning Algorithm for Blind Signal Separation, 1995, NIPS.

[16] Pierre Comon, et al. Independent component analysis, A new concept?, 1994, Signal Process.