Multi-Source Neural Variational Inference

Learning from multiple sources of information is an important problem in machine-learning research. The key challenges are learning representations and formulating inference methods that take into account the complementarity and redundancy of the various information sources. In this paper, we formulate a multi-source learning framework based on variational autoencoders, in which each encoder is conditioned on a different information source. This allows us to relate the sources via the shared latent variables by computing divergence measures between the individual sources' posterior approximations. We explore a variety of options for learning these encoders and for integrating the beliefs they compute into a consistent posterior approximation. We visualise learned beliefs on a toy dataset and evaluate our methods on learning shared representations and on structured output prediction, showing the trade-offs of learning separate encoders for each information source. Furthermore, we demonstrate how conflict detection and redundancy can increase the robustness of inference in a multi-source setting.
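
To make the idea of relating and integrating per-source beliefs concrete, the following is a minimal sketch and not the implementation used in the paper. It assumes that each encoder outputs a diagonal Gaussian over the shared latent variables, that beliefs are fused with a product of Gaussians (one of several possible integration rules), and that disagreement between two sources is scored with a symmetrised KL divergence. The function names and the example values are hypothetical and chosen only for illustration.

```python
import math


def kl_diag_gaussians(mu_p, var_p, mu_q, var_q):
    """KL(p || q) between two diagonal Gaussians given as lists of means and variances."""
    return 0.5 * sum(
        math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
        for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q)
    )


def symmetric_kl(mu_p, var_p, mu_q, var_q):
    """Symmetrised divergence KL(p || q) + KL(q || p), used here as an assumed conflict score."""
    return (kl_diag_gaussians(mu_p, var_p, mu_q, var_q)
            + kl_diag_gaussians(mu_q, var_q, mu_p, var_p))


def product_of_gaussians(mus, variances):
    """Fuse diagonal Gaussian beliefs by multiplying their densities.

    The fused precision is the sum of the source precisions, and the fused
    mean is the precision-weighted average of the source means.
    """
    fused_mu, fused_var = [], []
    for d in range(len(mus[0])):
        precisions = [1.0 / var[d] for var in variances]
        total_precision = sum(precisions)
        fused_var.append(1.0 / total_precision)
        fused_mu.append(
            sum(p * mu[d] for p, mu in zip(precisions, mus)) / total_precision
        )
    return fused_mu, fused_var


# Hypothetical beliefs of two encoders over a 2-D shared latent variable.
mu_a, var_a = [0.0, 0.0], [1.0, 1.0]
mu_b, var_b = [4.0, 0.0], [1.0, 1.0]

conflict = symmetric_kl(mu_a, var_a, mu_b, var_b)
fused_mu, fused_var = product_of_gaussians([mu_a, mu_b], [var_a, var_b])
print("conflict score:", conflict)
print("fused belief:", fused_mu, fused_var)
```

Under these assumptions, a large conflict score signals that the two sources disagree about the latent state, in which case a simple product may be overconfident. Agreeing sources yield a fused belief with lower variance than either source alone, which is one way that redundancy can sharpen inference.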
