Variational State-Space Models for Localisation and Dense 3D Mapping in 6 DoF

We solve the problem of 6-DoF localisation and 3D dense reconstruction in spatial environments as approximate Bayesian inference in a deep state-space model. Our approach leverages both learning and domain knowledge from multiple-view geometry and rigid-body dynamics. This results in an expressive predictive model of the world, often missing in current state-of-the-art visual SLAM solutions. The combination of variational inference, neural networks and a differentiable raycaster ensures that our model is amenable to end-to-end gradient-based optimisation. We evaluate our approach on realistic unmanned aerial vehicle flight data, nearing the performance of state-of-the-art visual-inertial odometry systems. We demonstrate the applicability of the model to generative prediction and planning.
