Multimodal Laughter Detection in Natural Discourses

This work addresses the detection of laughter in natural multiparty discourse. Features from two modalities are extracted from unobtrusive sources, namely a room microphone and a 360-degree camera. A relatively novel approach based on Echo State Networks (ESNs) is used for the task. One possible application is the online detection of laughter in human-robot interaction, which would enable a robot to react appropriately and in a timely fashion to human communication, since laughter is an important communicative signal.
