Unsupervised discovery of activity correlations using latent topic models

Topic models such as probabilistic Latent Semantic Analysis (pLSA) and Latent Dirichlet Allocation (LDA) have been used successfully to discover individual activities in a scene. However, these methods do not discover group activities, which are commonly observed in real-life videos of public places. In this paper we address the problem of discovering, in an unsupervised manner, both individual activities and the associations among them that form group activities. We propose a method that uses a two-layer hierarchical latent structure to correlate individual activities in the lower layer with group activities in the higher layer. Our model treats each scene as a mixture of group activities. Each group activity is in turn a mixture of individual activities, represented as a multinomial distribution, and each individual activity is a distribution over local visual features. We use a Gibbs sampling based algorithm to infer these activities. Our method can summarize not only the individual activities but also the common group activities in a video. We demonstrate the strength of our method by mining activities and the salient correlations among them in real-life videos of crowded public scenes.
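
The two-layer structure described above can be illustrated with a small generative sketch. It is only an illustration under stated assumptions, not the paper's implementation: the Dirichlet priors, the sizes `G`, `A`, and `V`, and the names `sample_dirichlet` and `sample_clip` are all hypothetical, and inference (the Gibbs sampler) is not shown.

```python
import random

def sample_dirichlet(rng, alphas):
    """Draw a probability vector from a Dirichlet via normalized gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

def sample_clip(rng, group_prior, group_to_activity, activity_to_word, n_words):
    """Generate visual words for one clip from a two-layer mixture.

    Higher layer: the clip is a mixture (theta) over group activities.
    Lower layer: each group activity is a multinomial over individual
    activities, and each individual activity is a distribution over
    local visual features (visual words).
    """
    # Assumption: the clip-level mixture has a Dirichlet prior.
    theta = sample_dirichlet(rng, group_prior)
    tokens = []
    for _ in range(n_words):
        g = rng.choices(range(len(theta)), weights=theta)[0]
        a = rng.choices(range(len(group_to_activity[g])),
                        weights=group_to_activity[g])[0]
        w = rng.choices(range(len(activity_to_word[a])),
                        weights=activity_to_word[a])[0]
        tokens.append((g, a, w))
    return theta, tokens

# Hypothetical sizes: 3 group activities, 5 individual activities, 20 visual words.
rng = random.Random(0)
G, A, V = 3, 5, 20
group_to_activity = [sample_dirichlet(rng, [1.0] * A) for _ in range(G)]
activity_to_word = [sample_dirichlet(rng, [1.0] * V) for _ in range(A)]
theta, tokens = sample_clip(rng, [1.0] * G, group_to_activity,
                            activity_to_word, n_words=50)
```

Each token records the group activity, individual activity, and visual word that were drawn. In the unsupervised setting only the visual words are observed, and the two latent assignments are what inference has to recover.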
