The 2011 Signal Separation Evaluation Campaign (SiSEC2011): - Audio Source Separation -

This paper summarizes the audio part of the 2011 community-based Signal Separation Evaluation Campaign (SiSEC2011). Four speech and music datasets were contributed, including datasets recorded in noisy or dynamic environments and a subset of the SiSEC2010 datasets. Participants addressed one or more of four source separation tasks, and the results for each task were evaluated using different objective performance criteria. We provide an overview of the audio datasets, tasks, and criteria. We also report the results achieved with the submitted systems and discuss organization strategies for future campaigns.
