Multimodal Emotion Recognition using Deep Continuous Conditional Recurrent Neural Fields

This paper presents a deep Continuous Conditional Recurrent Neural Fields (CCRNF) framework to model dimensional emotion from multiple input features. The deep architecture is built by stacking multiple gated recurrent neural networks to model complex, non-linear relationships across time and space. The effect of increasing layer depth is studied through a comparative performance analysis and a visual depiction of the gate activations, which provides insight into the flow of information across time and across layers. The paper further investigates model uncertainty, as captured by the Gaussian distribution of the model, and explores the use of inverse variances to fuse model decisions. This latter study serves as an initial discussion of how to quantify model and prediction confidence in continuous conditional random fields.
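
To make the fusion idea concrete, the following is a minimal sketch of classical inverse-variance weighting applied per time step. It is not taken from the paper: the function name `fuse_inverse_variance`, the list-of-lists layout (`means[m][t]` for model `m` at time step `t`), and the assumption that each model supplies an independent Gaussian mean and variance for every step are illustrative choices only, and the paper's actual fusion scheme may differ.

```python
from typing import List, Sequence, Tuple


def fuse_inverse_variance(
    means: Sequence[Sequence[float]],
    variances: Sequence[Sequence[float]],
) -> Tuple[List[float], List[float]]:
    """Fuse per-model Gaussian predictions with inverse-variance weights.

    Hypothetical layout: means[m][t] and variances[m][t] hold the predicted
    mean and variance of model m at time step t. Models are assumed to be
    independent, which is a simplification.
    """
    if not means or len(means) != len(variances):
        raise ValueError("means and variances must be non-empty and aligned")

    n_steps = len(means[0])
    if any(len(m) != n_steps for m in means) or any(
        len(v) != n_steps for v in variances
    ):
        raise ValueError("every model must cover the same time steps")

    fused_means: List[float] = []
    fused_vars: List[float] = []
    for t in range(n_steps):
        precisions = []
        for model_vars in variances:
            if model_vars[t] <= 0.0:
                raise ValueError("variances must be strictly positive")
            # A precision (1 / variance) acts as the confidence weight.
            precisions.append(1.0 / model_vars[t])

        total_precision = sum(precisions)
        weighted_sum = sum(
            p * model_means[t] for p, model_means in zip(precisions, means)
        )
        fused_means.append(weighted_sum / total_precision)
        # The fused variance is never larger than the smallest input variance.
        fused_vars.append(1.0 / total_precision)

    return fused_means, fused_vars
```

Under these assumptions, a model that reports a small variance at a given time step dominates the fused estimate there, while a model that is uncertain contributes little. When all variances are equal, the result reduces to a plain average.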
