Are Disentangled Representations Helpful for Abstract Visual Reasoning?

A disentangled representation encodes information about the salient factors of variation in the data independently. Although it is often argued that this representational format is useful for learning to solve many real-world downstream tasks, there is little empirical evidence to support this claim. In this paper, we conduct a large-scale study that investigates whether disentangled representations are more suitable for abstract reasoning tasks. Using two new tasks similar to Raven's Progressive Matrices, we evaluate the usefulness of the representations learned by 360 state-of-the-art unsupervised disentanglement models. Based on these representations, we train 3600 abstract reasoning models and observe that disentangled representations do in fact lead to better downstream performance. In particular, they enable quicker learning with fewer samples.
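To make the evaluation pipeline concrete, the following is a minimal sketch in PyTorch, under assumptions of our own: a pretrained encoder (e.g. from a β-VAE) is kept frozen, each panel of a Raven's-style matrix is embedded with it, and a small scoring head ranks the candidate answer panels. The plain MLP head stands in for the relational reasoning models actually evaluated; all dimensions, class names, and the `score_answers` helper are hypothetical, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

# Illustrative constants -- assumed, not taken from the paper.
LATENT_DIM = 10   # size of the (frozen) learned representation
NUM_CONTEXT = 8   # context panels of a 3x3 matrix with one panel missing
NUM_ANSWERS = 6   # candidate answer panels

class ReasoningHead(nn.Module):
    """Scores each candidate answer given the embedded context panels.

    A simple MLP stand-in for the downstream abstract reasoning models.
    """
    def __init__(self, latent_dim: int = LATENT_DIM):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear((NUM_CONTEXT + 1) * latent_dim, 256),
            nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, context_z, answer_z):
        # context_z: (batch, NUM_CONTEXT, latent_dim)
        # answer_z:  (batch, NUM_ANSWERS, latent_dim)
        batch = context_z.size(0)
        # Pair the full context with each candidate answer.
        ctx = context_z.reshape(batch, 1, -1).expand(-1, NUM_ANSWERS, -1)
        pairs = torch.cat([ctx, answer_z], dim=-1)
        return self.mlp(pairs).squeeze(-1)  # (batch, NUM_ANSWERS) logits

def score_answers(encoder, head, context, answers):
    """Embed all panels with the frozen encoder, then score the candidates.

    context: (batch, NUM_CONTEXT, C, H, W); answers: (batch, NUM_ANSWERS, C, H, W)
    """
    with torch.no_grad():  # the representation itself is not trained further
        context_z = encoder(context.flatten(0, 1)).reshape(*context.shape[:2], -1)
        answer_z = encoder(answers.flatten(0, 1)).reshape(*answers.shape[:2], -1)
    return head(context_z, answer_z)
```

Training only the head (e.g. with a cross-entropy loss over the candidate logits) while the encoder stays frozen isolates the contribution of the representation itself; repeating this across many pretrained encoders and sample budgets yields the kind of comparison reported above.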
