Discriminative Unsupervised Feature Learning with Convolutional Neural Networks

Current methods for training convolutional neural networks depend on large amounts of labeled samples for supervised training. In this paper we present an approach for training a convolutional neural network using only unlabeled data. We train the network to discriminate between a set of surrogate classes. Each surrogate class is formed by applying a variety of transformations to a randomly sampled 'seed' image patch. We find that this simple feature learning algorithm is surprisingly successful when applied to visual object recognition. The feature representation learned by our algorithm achieves classification results matching or outperforming the current state-of-the-art for unsupervised learning on several popular datasets (STL-10, CIFAR-10, Caltech-101).

[1]  Shenghuo Zhu,et al.  Deep Learning of Invariant Features via Simulated Fixations in Video , 2012, NIPS.

[2]  Yann LeCun,et al.  Tangent Prop - A Formalism for Specifying Selected Invariances in an Adaptive Network , 1991, NIPS.

[3]  Thomas Brox,et al.  Descriptor Matching with Convolutional Neural Networks: a Comparison to SIFT , 2014, ArXiv.

[4]  Massih-Reza Amini,et al.  Semi Supervised Logistic Regression , 2002, ECAI.

[5]  Qiang Chen,et al.  Network In Network , 2013, ICLR.

[6]  Yoshua Bengio,et al.  Entropy Regularization , 2006, Semi-Supervised Learning.

[7]  Jason Weston,et al.  Natural Language Processing (Almost) from Scratch , 2011, J. Mach. Learn. Res..

[8]  Yoshua Bengio,et al.  Extracting and composing robust features with denoising autoencoders , 2008, ICML '08.

[9]  Nitish Srivastava,et al.  Improving neural networks by preventing co-adaptation of feature detectors , 2012, ArXiv.

[10]  Harris Drucker,et al.  Improving generalization performance using double backpropagation , 1992, IEEE Trans. Neural Networks.

[11]  Pietro Perona,et al.  Learning Generative Visual Models from Few Training Examples: An Incremental Bayesian Approach Tested on 101 Object Categories , 2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop.

[12]  Sida I. Wang,et al.  Dropout Training as Adaptive Regularization , 2013, NIPS.

[13]  Lawrence D. Jackel,et al.  Backpropagation Applied to Handwritten Zip Code Recognition , 1989, Neural Computation.

[14]  Honglak Lee,et al.  Learning Invariant Representations with Local Transformations , 2012, ICML.

[15]  Nicolas Le Roux,et al.  Ask the locals: Multi-way local pooling for image recognition , 2011, 2011 International Conference on Computer Vision.

[16]  Honglak Lee,et al.  An Analysis of Single-Layer Networks in Unsupervised Feature Learning , 2011, AISTATS.

[17]  Rob Fergus,et al.  Visualizing and Understanding Convolutional Networks , 2013, ECCV.

[18]  Andrew Y. Ng,et al.  Selecting Receptive Fields in Deep Networks , 2011, NIPS.

[19]  Alex Krizhevsky,et al.  Learning Multiple Layers of Features from Tiny Images , 2009 .

[20]  Y-Lan Boureau,et al.  Learning Convolutional Feature Hierarchies for Visual Recognition , 2010, NIPS.

[21]  Kyunghyun Cho Simple Sparsification Improves Sparse Denoising Autoencoders in Denoising Highly Corrupted Images , 2013, ICML.

[22]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[23]  Trevor Darrell,et al.  Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[24]  Pascal Vincent,et al.  A Generative Process for Contractive Auto-Encoders , 2012, ICML.

[25]  Dieter Fox,et al.  Unsupervised Feature Learning for RGB-D Based Object Recognition , 2012, ISER.

[26]  Yihong Gong,et al.  Training Hierarchical Feed-Forward Visual Recognition Models Using Transfer Learning from Pseudo-Tasks , 2008, ECCV.

[27]  Dieter Fox,et al.  Multipath Sparse Coding Using Hierarchical Matching Pursuit , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[28]  Pascal Vincent,et al.  The Manifold Tangent Classifier , 2011, NIPS.

[29]  Ka Yu Hui,et al.  Direct Modeling of Complex Invariances for Visual Object Features , 2013, ICML.

[30]  Jasper Snoek,et al.  Multi-Task Bayesian Optimization , 2013, NIPS.

[31]  Jian Sun,et al.  Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[32]  Trevor Darrell,et al.  Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[33]  Trevor Darrell,et al.  DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition , 2013, ICML.