Efficient Nearest Neighbors via Robust Sparse Hashing

This paper presents a new nearest neighbor (NN) retrieval framework: robust sparse hashing (RSH). Our approach is inspired by the success of dictionary learning for sparse coding. The key idea is to sparse code the data using a learned dictionary and then to generate hash codes from these sparse codes for accurate and fast NN retrieval. However, applying sparse coding directly to NN retrieval poses a technical difficulty: when the data are noisy or uncertain (as is the case with most real-world data sets), the hash code generated from the sparse code of a query point seldom finds an exact match, which breaks NN retrieval. Borrowing ideas from robust optimization theory, we circumvent this difficulty with a novel robust dictionary learning and sparse coding framework, RSH, which learns dictionaries on the robustified counterparts of the perturbed data points. We apply the algorithm to NN retrieval on both simulated and real-world data. Our results demonstrate that RSH holds significant promise for efficient NN retrieval compared with the state of the art.
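
The pipeline described above (sparse code each point, derive a hash code from the sparse code, retrieve by exact hash match) can be sketched in a few lines of Python. This is only an illustrative sketch under several assumptions that the abstract does not state: the dictionary `D` is random rather than learned, sparse coding uses a plain greedy orthogonal matching pursuit (`omp`), and the hash code is taken to be the sorted tuple of active atom indices (`hash_code`). All names and parameters here are hypothetical, and the robust formulation of RSH itself is not implemented.

```python
import numpy as np

def omp(D, x, k):
    """Greedy orthogonal matching pursuit: approximate x with k atoms of D."""
    residual = x.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        # Pick the atom most correlated with the current residual.
        scores = np.abs(D.T @ residual)
        if support:
            scores[support] = -np.inf  # never pick the same atom twice
        support.append(int(np.argmax(scores)))
        # Re-fit the coefficients on the chosen atoms by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
    return support, coef

def hash_code(D, x, k):
    """Assumed hash: the sorted indices of the atoms used to code x."""
    support, _ = omp(D, x, k)
    return tuple(sorted(support))

rng = np.random.default_rng(0)
dim, n_atoms, k = 16, 64, 3

# Assumption: a random unit-norm dictionary stands in for a learned one.
D = rng.standard_normal((dim, n_atoms))
D /= np.linalg.norm(D, axis=0)

# Build a hash table mapping each code to the database points that share it.
database = rng.standard_normal((200, dim))
table = {}
for i, x in enumerate(database):
    table.setdefault(hash_code(D, x, k), []).append(i)

# A noise-free query lands in the bucket of its database copy.
query = database[7].copy()
candidates = table.get(hash_code(D, query, k), [])

# A perturbed query may or may not produce the same code; a mismatch here
# is the failure mode that motivates a robust formulation.
noisy_query = query + 0.1 * rng.standard_normal(dim)
same_code = hash_code(D, noisy_query, k) == hash_code(D, query, k)
print("candidates for clean query:", candidates)
print("noisy query keeps the same hash code:", same_code)
```

Comparing `same_code` across different noise levels gives a rough feel for how fragile exact hash matching can be when the query is perturbed.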
