Deterministic Variational Inference for Neural SDEs

Neural Stochastic Differential Equations (NSDEs) model the drift and diffusion functions of a stochastic process as neural networks. While NSDEs are known to predict time series accurately, their uncertainty quantification properties remain unexplored. No existing approximate inference method simultaneously allows flexible models and provides high-quality uncertainty estimates at a reasonable computational cost. Existing SDE inference methods either make overly restrictive assumptions, e.g. linearity, or rely on Monte Carlo integration that requires many samples at prediction time for reliable uncertainty quantification. However, many real-world safety-critical applications necessitate highly expressive models that can quantify prediction uncertainty at affordable computational cost. We introduce a variational inference scheme that approximates the posterior distribution of an NSDE governing a latent state space by a deterministic chain of operations. We approximate the intractable data-fit term of the evidence lower bound by a novel bidimensional moment matching algorithm: vertical along the neural net layers and horizontal along the time direction. Our algorithm achieves uncertainty calibration scores that its sampling-based counterparts can match only at significantly higher computational cost, while providing equally accurate forecasts of the system dynamics.
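The bidimensional moment matching described above can be illustrated with a minimal sketch: a "vertical" pass propagates Gaussian moments through a network nonlinearity in closed form, and a "horizontal" step propagates them along time through an Euler-Maruyama discretization. This is a simplified scalar illustration under our own assumptions, not the paper's algorithm: the function names are hypothetical, the drift is a bare ReLU rather than a full network, and the cross-covariance between the state and the drift is dropped.

```python
import math

def relu_moments(m, v):
    """Closed-form mean and variance of ReLU(x) for x ~ N(m, v).

    This is the 'vertical' moment-matching step through one activation;
    a full network would chain such steps layer by layer.
    """
    s = math.sqrt(v)
    alpha = m / s
    phi = math.exp(-0.5 * alpha**2) / math.sqrt(2.0 * math.pi)  # N(0,1) pdf
    Phi = 0.5 * (1.0 + math.erf(alpha / math.sqrt(2.0)))        # N(0,1) cdf
    mean = m * Phi + s * phi
    second = (m**2 + v) * Phi + m * s * phi                     # E[ReLU(x)^2]
    return mean, max(second - mean**2, 1e-12)

def euler_moment_step(m, v, diffusion, dt):
    """One 'horizontal' moment-matching step of Euler-Maruyama,
    x_{k+1} = x_k + f(x_k) dt + g dW, with f = ReLU for illustration.

    Simplification: the cross-covariance Cov(x_k, f(x_k)) is ignored,
    so the variance update is only an approximation.
    """
    fm, fv = relu_moments(m, v)          # drift moments from the vertical pass
    m_new = m + fm * dt
    v_new = v + fv * dt**2 + diffusion**2 * dt
    return m_new, v_new
```

The key point of the sketch is that both steps are deterministic transformations of `(mean, variance)` pairs, so no sampling is needed at prediction time; a sampling-based counterpart would have to simulate many Euler-Maruyama paths to estimate the same quantities.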
