Mind the Gap when Conditioning Amortised Inference in Sequential Latent-Variable Models

Amortised inference enables scalable learning of sequential latent-variable models (LVMs) with the evidence lower bound (ELBO). In this setting, variational posteriors are often only partially conditioned: while the true posteriors depend on, e.g., the entire sequence of observations, approximate posteriors are informed only by past observations. This mimics the Bayesian filter, which is a mixture of smoothing posteriors. Yet we show that the ELBO objective forces partially conditioned amortised posteriors to approximate products of smoothing posteriors instead. Consequently, the learned generative model is compromised. We demonstrate these theoretical findings in three scenarios: traffic flow, handwritten digits, and aerial vehicle dynamics. With fully conditioned approximate posteriors, performance improves in terms of generative modelling and multi-step prediction.
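The gap between a mixture and a product of smoothing posteriors can be illustrated with a small numerical sketch. It is a toy under stated assumptions, not the setup used in the paper: a single discrete latent with three states, one unseen future observation with two equally likely values, and hand-picked smoothing posteriors (all numbers and function names are hypothetical). A posterior that cannot see the future is scored by its KL divergence to each smoothing posterior, averaged over futures, which is the part of the expected ELBO gap that depends on the approximate posterior. In this toy, that score is lower for the normalised geometric mean (a product) than for the arithmetic mixture (the filter).

```python
import math

def normalise(weights):
    """Scale non-negative weights so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]

def kl(q, p):
    """KL(q || p) for discrete distributions given as lists."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

# Hypothetical toy numbers: p(future | past) and p(z | past, future).
future_probs = [0.5, 0.5]
smoothing_posteriors = [
    [0.90, 0.05, 0.05],  # smoothing posterior if the future takes value a
    [0.05, 0.05, 0.90],  # smoothing posterior if the future takes value b
]
num_states = 3

def expected_kl(q):
    """Average KL from q to the smoothing posteriors over unseen futures."""
    return sum(w * kl(q, p) for w, p in zip(future_probs, smoothing_posteriors))

# Bayesian filter: arithmetic mixture of the smoothing posteriors.
filtering = [
    sum(w * p[i] for w, p in zip(future_probs, smoothing_posteriors))
    for i in range(num_states)
]

# Product form: normalised weighted geometric mean of the smoothing posteriors.
product = normalise([
    math.exp(sum(w * math.log(p[i]) for w, p in zip(future_probs, smoothing_posteriors)))
    for i in range(num_states)
])

print("filter  :", [round(v, 4) for v in filtering], round(expected_kl(filtering), 4))
print("product :", [round(v, 4) for v in product], round(expected_kl(product), 4))
```

In this toy, the product form places more mass on the middle state than the filter does, so a posterior trained this way differs from the filtering distribution it was meant to imitate.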
