A new discriminant NMF algorithm and its application to the extraction of subtle emotional differences in speech

In this study we propose a new feature extraction algorithm, dNMF (discriminant non-negative matrix factorization), which learns subtle class-related differences while maintaining an accurate generative capability. In addition to minimizing the representation error, as the standard NMF (non-negative matrix factorization) algorithm does, dNMF also increases the between-class variance to gain discriminant power. The multiplicative NMF learning algorithm has been modified to cope with this additional constraint. The cost function was carefully designed so that extracting the feature coefficients of a single test pattern, given pre-trained feature vectors, is a convex quadratic optimization problem in non-negative space, which ensures a unique solution. This design also resolves issues with previous discriminant NMF algorithms. The dNMF algorithm has been applied to emotion recognition in speech, a task that requires emphasizing emotional differences while de-emphasizing the dominant phonetic components. On an emotional speech database, dNMF successfully extracted subtle emotional differences, demonstrated much better recognition performance, and showed a smaller representation error.
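The building blocks named in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the dNMF algorithm itself: it shows the standard multiplicative NMF updates for the squared error, the non-negative coefficient extraction for one test pattern with a fixed basis (a convex quadratic problem in the coefficients), and one common definition of between-class scatter. The abstract does not give the dNMF cost function or how the discriminant term enters the updates, so all function names and parameters below are hypothetical.

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Standard multiplicative NMF updates for ||V - W H||_F^2.

    This is plain NMF, NOT dNMF: the discriminant term is omitted
    because the abstract does not specify it.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps   # non-negative basis (feature vectors)
    H = rng.random((rank, m)) + eps   # non-negative coefficients
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def encode(v, W, n_iter=500, eps=1e-9):
    """Coefficients of one test pattern v for a fixed, pre-trained basis W.

    Approximately solves min_h ||v - W h||^2 subject to h >= 0, which is
    a convex quadratic problem in h, using multiplicative updates.
    """
    h = np.ones(W.shape[1])
    WtW = W.T @ W
    Wtv = W.T @ v
    for _ in range(n_iter):
        h *= Wtv / (WtW @ h + eps)
    return h

def between_class_scatter(H, labels):
    """Trace of the between-class scatter of coefficient columns.

    Assumed definition: sum over classes of
    (class size) * ||class mean - global mean||^2.
    """
    labels = np.asarray(labels)
    mu = H.mean(axis=1, keepdims=True)
    total = 0.0
    for c in np.unique(labels):
        Hc = H[:, labels == c]
        d = Hc.mean(axis=1, keepdims=True) - mu
        total += Hc.shape[1] * float((d ** 2).sum())
    return total
```

In this sketch, `W` and `H` would be learned from training data with `nmf_multiplicative`, and `encode(v, W)` would then be called once per test pattern. A discriminant variant would add a term based on a quantity like `between_class_scatter` to the training cost; how dNMF does this is not stated in the abstract.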
