Sparsity priors and boosting for learning localized distributed feature representations

This technical report presents a study of methods for learning sparse codes and localized features from data. In the context of this study, we propose a new prior for generating sparse image codes with low-energy, localized features. The experiments show that with this prior, it is possible to encode the model with significantly fewer bits without affecting accuracy. The report also introduces a boosting method for learning the structure and parameters of sparse coding models. The new methods are compared to several existing sparse coding techniques on two tasks: reconstruction of natural image patches and self-taught learning. The experiments examine the effect of structural choices, priors, and dataset size on model size and performance. Interestingly, we discover that, for sparse coding, it is possible to obtain more compact models without incurring reconstruction errors simply by increasing the dataset size.
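The sparse coding setting the report builds on can be illustrated with a minimal sketch: reconstruct a signal as a linear combination of dictionary atoms while a sparsity prior drives most coefficients to zero. The sketch below uses a generic L1 (Laplacian) prior solved by ISTA (iterative shrinkage-thresholding), not the report's proposed prior or its boosting method; the dictionary `D`, signal `x`, and all parameter values are illustrative assumptions.

```python
# Sketch of sparse coding with an L1 sparsity prior, solved by ISTA.
# Objective: minimize 0.5 * ||D z - x||^2 + lam * ||z||_1 over the code z.
# D, x, lam, eta, and iters below are illustrative choices, not values
# taken from the report.

def soft_threshold(v, t):
    """Proximal operator of the L1 norm: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def ista(D, x, lam=0.1, eta=0.1, iters=500):
    """Iterative shrinkage-thresholding for the L1-penalized objective."""
    n, m = len(x), len(D[0])   # signal dimension, number of atoms
    z = [0.0] * m
    for _ in range(iters):
        # Residual r = D z - x.
        r = [sum(D[i][j] * z[j] for j in range(m)) - x[i] for i in range(n)]
        # Gradient of the reconstruction term: g = D^T r.
        g = [sum(D[i][j] * r[i] for i in range(n)) for j in range(m)]
        # Gradient step, then shrinkage toward zero (the sparsity prior).
        z = [soft_threshold(z[j] - eta * g[j], eta * lam) for j in range(m)]
    return z

# Overcomplete dictionary: three atoms for a 2-D signal; the third atom
# is redundant (a mixture of the first two).
D = [[1.0, 0.0, 0.7],
     [0.0, 1.0, 0.7]]
x = [1.0, 0.0]   # the first atom alone can reconstruct x

z = ista(D, x)
# The L1 prior selects the single relevant atom and sets the other
# coefficients to exactly zero, yielding a sparse, localized code.
```

Because the penalty is non-smooth at zero, the shrinkage step sets small coefficients exactly to zero rather than merely making them small; this is what makes the learned codes sparse and the model cheap to store, the property the report's low-energy prior pushes further.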
