Paper information - Learning about an exponential amount of conditional distributions

We introduce the Neural Conditioner (NC), a self-supervised machine able to learn about all the conditional distributions of a random vector $X$. The NC is a function $NC(x \cdot a, a, r)$ that leverages adversarial training to match each conditional distribution $P(X_r|X_a=x_a)$. After training, the NC generalizes to sample from conditional distributions never seen, including the joint distribution. The NC is also able to auto-encode examples, providing data representations useful for downstream classification tasks. In sum, the NC integrates different self-supervised tasks (each being the estimation of a conditional distribution) and levels of supervision (partially observed data) seamlessly into a single learning experience.
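The input convention $NC(x \cdot a, a, r)$ can be illustrated with a small NumPy sketch. It assumes, as the notation suggests but the abstract does not spell out, that $a$ and $r$ are binary masks over the coordinates of $X$ (observed and requested, respectively) and that $x \cdot a$ is an elementwise product. The helper name `nc_inputs` is hypothetical, and the sketch only builds the inputs; it does not implement the adversarially trained network itself.

```python
import numpy as np

def nc_inputs(x, a, r):
    """Return the triple (x * a, a, r): masked values plus both masks.

    Assumption: a and r are 0/1 masks with the same length as x.
    """
    x = np.asarray(x, dtype=float)
    a = np.asarray(a, dtype=float)
    r = np.asarray(r, dtype=float)
    # Elementwise product zeroes out every coordinate that is not observed.
    return x * a, a, r

rng = np.random.default_rng(0)
d = 6
x = rng.normal(size=d)

# One randomly chosen conditional P(X_r | X_a = x_a):
# a marks observed coordinates, r marks requested coordinates.
a = rng.integers(0, 2, size=d)
r = rng.integers(0, 2, size=d)
x_masked, a_in, r_in = nc_inputs(x, a, r)

# The joint distribution as a special case: nothing observed, everything requested.
a_joint = np.zeros(d)
r_joint = np.ones(d)
x_joint, _, _ = nc_inputs(x, a_joint, r_joint)
```

Under this reading, each of the $2^d$ possible choices of $a$ (and likewise of $r$) defines a different conditioning task, which is where the "exponential amount" in the title comes from. Passing the mask $a$ alongside $x \cdot a$ lets a model distinguish a coordinate that is unobserved from one whose observed value happens to be zero.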
