No Representation without Transformation

We extend the framework of variational autoencoders to represent transformations explicitly in the latent space. In the resulting family of hierarchical graphical models, the latent space is populated by higher-order objects that are inferred jointly with the latent representations they act on. To demonstrate the effect of these higher-order objects, we show that the inferred latent transformations reflect interpretable properties in the observation space. Furthermore, the model is structured so that, in the absence of transformations, we can still run inference and obtain generative capabilities comparable to those of standard variational autoencoders. Finally, using the trained encoder, we outperform the baselines by a wide margin on a challenging out-of-distribution classification task.
