Learning Stochastic Recurrent Networks

Leveraging advances in variational inference, we propose to enhance recurrent neural networks with latent variables, resulting in Stochastic Recurrent Networks (STORNs). The model (i) can be trained with stochastic gradient methods, (ii) allows for structured and multi-modal conditionals at each time step, (iii) features a reliable estimator of the marginal likelihood, and (iv) is a generalisation of deterministic recurrent neural networks. We evaluate the method on four polyphonic music data sets and on motion capture data.
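To make the idea concrete, the following is a minimal, hypothetical sketch of the kind of training objective such a model optimises: at each time step a Gaussian latent variable is sampled via the reparameterisation trick, fed into a deterministic recurrence, and the per-step evidence lower bound (reconstruction log-likelihood minus the KL divergence to a standard-normal prior) is accumulated. The dimensions, weight names, and the unit-variance Gaussian likelihood are illustrative assumptions, not the authors' exact architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions: observation, latent, hidden, sequence length.
D_X, D_Z, D_H, T = 3, 2, 4, 5

# Randomly initialised weights of a toy recognition/generative pair.
W_enc = rng.normal(scale=0.1, size=(D_X, 2 * D_Z))    # x_t -> (mu_t, log_var_t)
W_rec = rng.normal(scale=0.1, size=(D_H, D_H))        # hidden-to-hidden recurrence
W_in = rng.normal(scale=0.1, size=(D_X + D_Z, D_H))   # (x_t, z_t) -> hidden
W_out = rng.normal(scale=0.1, size=(D_H, D_X))        # hidden -> reconstruction mean

def elbo(x):
    """One-sample Monte Carlo estimate of the evidence lower bound for a
    sequence x of shape (T, D_X), assuming a standard-normal prior over the
    per-step latents and a unit-variance Gaussian likelihood."""
    h = np.zeros(D_H)
    total = 0.0
    for t in range(T):
        # Recognition model: map the observation to Gaussian parameters.
        stats = x[t] @ W_enc
        mu, log_var = stats[:D_Z], stats[D_Z:]
        # Reparameterisation trick: z_t = mu + sigma * eps, eps ~ N(0, I),
        # which keeps the sample differentiable w.r.t. (mu, log_var).
        eps = rng.normal(size=D_Z)
        z = mu + np.exp(0.5 * log_var) * eps
        # Deterministic recurrence conditioned on the observation and latent.
        h = np.tanh(h @ W_rec + np.concatenate([x[t], z]) @ W_in)
        x_hat = h @ W_out
        # Gaussian log-likelihood with unit variance, up to a constant.
        log_lik = -0.5 * np.sum((x[t] - x_hat) ** 2)
        # Analytic KL divergence between N(mu, sigma^2) and N(0, 1).
        kl = 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)
        total += log_lik - kl
    return total

x = rng.normal(size=(T, D_X))
print(elbo(x))
```

Maximising this bound with stochastic gradient methods is what makes point (i) of the abstract possible, and the per-step latent z_t is what admits the multi-modal conditionals of point (ii).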
