Paper Information - Monaural Source Separation Using Spectral Cues

Monaural Source Separation Using Spectral Cues

The acoustic environment poses at least two important challenges. First, animals must localise sound sources using a variety of binaural and monaural cues; second, they must separate sources into distinct auditory streams (the “cocktail party problem”). Binaural cues include interaural intensity and phase disparities. The primary monaural cue is the spectral filtering introduced by the head and pinnae, described by the head-related transfer function (HRTF), which imposes different linear filters upon sources arising at different spatial locations.
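The monaural cue described above can be illustrated with a toy sketch: a single ear receives the sum of several sources, each passed through a linear filter that depends on the source's location. This is only an illustration of that signal model, not the paper's separation method. The filter taps, the location names, and the helpers `convolve` and `monaural_mixture` are hypothetical placeholders chosen for the example; they are not measured HRTFs.

```python
# Toy signal model: one ear hears the sum of sources, each shaped by a
# location-dependent linear filter. All taps below are made-up
# placeholders (assumption), NOT measured HRTFs.

def convolve(signal, taps):
    """Full linear convolution of two finite sequences."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(taps):
            out[i + j] += s * h
    return out

# Hypothetical FIR filters standing in for the HRTF at two locations.
HYPOTHETICAL_FILTERS = {
    "location_a": [1.0, 0.5, 0.0],
    "location_b": [0.5, 0.0, -0.25],
}

def monaural_mixture(sources):
    """Sum the filtered sources.

    sources: dict mapping a location name to a source signal;
    all signals are assumed to have the same length.
    """
    filtered = [
        convolve(x, HYPOTHETICAL_FILTERS[loc]) for loc, x in sources.items()
    ]
    return [sum(samples) for samples in zip(*filtered)]

sources = {
    "location_a": [1.0, 0.0, 0.0, 0.0],  # unit impulse
    "location_b": [0.0, 1.0, 0.0, 0.0],  # impulse delayed by one sample
}
mixture = monaural_mixture(sources)
print(mixture)  # [1.0, 1.0, 0.0, -0.25, 0.0, 0.0]
```

Because each source here is an impulse, the mixture is simply the two location filters overlaid with a one-sample offset. Each source carries the spectral signature of its location into the single-channel signal, and that signature is the cue a monaural separation scheme would try to exploit.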

Barak A. Pearlmutter | Anthony M. Zador
