Factored conditional restricted Boltzmann Machines for modeling motion style

The Conditional Restricted Boltzmann Machine (CRBM) is a recently proposed model for time series that has a rich, distributed hidden state and permits simple, exact inference. We present a new model, based on the CRBM that preserves its most important computational properties and includes multiplicative three-way interactions that allow the effective interaction weight between two units to be modulated by the dynamic state of a third unit. We factor the three-way weight tensor implied by the multiplicative model, reducing the number of parameters from O(N3) to O(N2). The result is an efficient, compact model whose effectiveness we demonstrate by modeling human motion. Like the CRBM, our model can capture diverse styles of motion with a single set of parameters, and the three-way interactions greatly improve the model's ability to blend motion styles or to transition smoothly among them.

[1]  Paul Smolensky,et al.  Information processing in dynamical systems: foundations of harmony theory , 1986 .

[2]  David Haussler,et al.  Unsupervised learning of distributions on binary vectors using two layer networks , 1991, NIPS 1991.

[3]  Aaron Hertzmann,et al.  Style machines , 2000, SIGGRAPH 2000.

[4]  Geoffrey E. Hinton Training Products of Experts by Minimizing Contrastive Divergence , 2002, Neural Computation.

[5]  Neil D. Lawrence,et al.  Gaussian Process Latent Variable Models for Visualisation of High Dimensional Data , 2003, NIPS.

[6]  Geoffrey E. Hinton,et al.  Exponential Family Harmoniums with an Application to Information Retrieval , 2004, NIPS.

[7]  Kari Pulli,et al.  Style translation for human motion , 2005, SIGGRAPH 2005.

[8]  Miguel Á. Carreira-Perpiñán,et al.  On Contrastive Divergence Learning , 2005, AISTATS.

[9]  Geoffrey E. Hinton,et al.  Reducing the Dimensionality of Data with Neural Networks , 2006, Science.

[10]  Yee Whye Teh,et al.  A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.

[11]  Lorenzo Torresani,et al.  Learning Motion Style Synthesis from Perceptual Observations , 2006, NIPS.

[12]  Geoffrey E. Hinton,et al.  Restricted Boltzmann machines for collaborative filtering , 2007, ICML '07.

[13]  Neil D. Lawrence,et al.  Learning for Larger Datasets with the Gaussian Process Latent Variable Model , 2007, AISTATS.

[14]  Honglak Lee,et al.  Sparse deep belief net model for visual area V2 , 2007, NIPS.

[15]  David J. Fleet,et al.  Multifactor Gaussian process models for style-content separation , 2007, ICML '07.

[16]  Geoffrey E. Hinton,et al.  Unsupervised Learning of Image Transformations , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  B. Schölkopf,et al.  Modeling Human Motion Using Binary Latent Variables , 2007 .

[18]  Geoffrey E. Hinton,et al.  Modeling image patches with a directed hierarchy of Markov random fields , 2007, NIPS.

[19]  Ruslan Salakhutdinov,et al.  On the quantitative analysis of deep belief networks , 2008, ICML '08.

[20]  Roland Memisevic,et al.  Non-linear Latent Factor Models for Revealing Structure in High-dimensional Data , 2008 .

[21]  Tijmen Tieleman,et al.  Training restricted Boltzmann machines using approximations to the likelihood gradient , 2008, ICML '08.