Paper Information - Group Sparse Coding

Group Sparse Coding

Bag-of-words document representations are often used in text, image, and video processing. While it is relatively easy to determine a suitable word dictionary for text documents, there is no simple mapping from raw images or videos to dictionary terms. The classical approach builds a dictionary using vector quantization over a large set of useful visual descriptors extracted from a training set, and uses a nearest-neighbor algorithm to count the number of occurrences of each dictionary word in documents to be encoded. More robust approaches proposed recently represent each visual descriptor as a sparse weighted combination of dictionary words. While favoring a sparse representation at the level of visual descriptors, these methods do not, however, ensure that images have a sparse representation. In this work, we use mixed-norm regularization to achieve sparsity at the image level as well as a small overall dictionary. This approach can also be used to encourage using the same dictionary words for all the images in a class, providing a discriminative signal in the construction of image representations. Experimental results on a benchmark image classification dataset show that when compact image or dictionary representations are needed for computational efficiency, the proposed approach yields better mean average precision in classification.
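
To make the idea of mixed-norm regularization concrete, the following is a minimal sketch rather than the authors' implementation. The abstract does not say which mixed norm is used; the sketch assumes the common l1/l2 form, where the penalty sums the l2 norms of the rows of a coefficient matrix. The matrix `A`, the threshold `lam`, and the function names `mixed_norm` and `group_soft_threshold` are hypothetical and chosen only for illustration. Each row holds the weights of one dictionary word across all descriptors of one image, so shrinking a whole row to zero removes that word from the image's representation.

```python
import numpy as np


def mixed_norm(A):
    """Assumed l1/l2 mixed norm: sum over rows of each row's l2 norm."""
    return float(np.sum(np.linalg.norm(A, axis=1)))


def group_soft_threshold(A, lam):
    """Proximal operator of lam * mixed_norm(A).

    Each row's l2 norm is reduced by lam; rows whose norm is at most
    lam become exactly zero, which is what yields group sparsity.
    """
    norms = np.linalg.norm(A, axis=1, keepdims=True)
    # Guard against division by zero for rows that are already all zero.
    scale = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
    return A * scale


# Hypothetical coefficients: rows = dictionary words,
# columns = visual descriptors extracted from one image.
A = np.array([
    [3.0, 4.0],  # row norm 5.0: a strongly used word
    [0.3, 0.4],  # row norm 0.5: a weakly used word
    [0.0, 2.0],  # row norm 2.0
])

shrunk = group_soft_threshold(A, lam=1.0)
active_words = int(np.count_nonzero(np.linalg.norm(shrunk, axis=1)))

print(mixed_norm(A))   # penalty before shrinkage
print(shrunk)          # the weakly used word's row is zeroed as a group
print(active_words)    # number of dictionary words still in use
```

A plain l1 penalty would zero individual entries independently, which gives sparsity per descriptor but can still leave every dictionary word active somewhere in the image. Grouping by row ties the entries of a word together, so the word is either used by the image or dropped entirely.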
