Multimodal Similarity-Preserving Hashing

We introduce an efficient computational framework for hashing data belonging to multiple modalities into a single representation space where they become mutually comparable. The proposed approach is based on a novel coupled siamese neural network architecture and allows unified treatment of intra- and inter-modality similarity learning. Unlike existing cross-modality similarity learning approaches, our hashing functions are not limited to binarized linear projections and can assume arbitrarily complex forms. We show experimentally that our method significantly outperforms state-of-the-art hashing approaches on multimedia retrieval tasks.

[1]  P. Werbos,et al.  Beyond Regression : "New Tools for Prediction and Analysis in the Behavioral Sciences , 1974 .

[2]  Richard A. Johnson,et al.  Applied Multivariate Statistical Analysis , 1983 .

[3]  Paul J. Werbos,et al.  Applications of advances in nonlinear sensitivity analysis , 1982 .

[4]  Geoffrey E. Hinton,et al.  Learning representations by back-propagating errors , 1986, Nature.

[5]  Yann LeCun,et al.  Signature Verification Using A "Siamese" Time Delay Neural Network , 1993, Int. J. Pattern Recognit. Artif. Intell..

[6]  Jürgen Schmidhuber,et al.  Discovering Predictable Classifications , 1993, Neural Computation.

[7]  Sunil Arya,et al.  An optimal algorithm for approximate nearest neighbor searching fixed dimensions , 1998, JACM.

[8]  Bernhard Schölkopf,et al.  Kernel Principal Component Analysis , 1997, ICANN.

[9]  Patrick J. F. Groenen,et al.  Modern Multidimensional Scaling: Theory and Applications , 2003 .

[10]  Piotr Indyk,et al.  Similarity Search in High Dimensions via Hashing , 1999, VLDB.

[11]  B. Scholkopf,et al.  Fisher discriminant analysis with kernels , 1999, Neural Networks for Signal Processing IX: Proceedings of the 1999 IEEE Signal Processing Society Workshop (Cat. No.98TH8468).

[12]  S T Roweis,et al.  Nonlinear dimensionality reduction by locally linear embedding. , 2000, Science.

[13]  Michael I. Jordan,et al.  Distance Metric Learning with Application to Clustering with Side-Information , 2002, NIPS.

[14]  Trevor Darrell,et al.  Fast pose estimation with parameter-sensitive hashing , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[15]  Mikhail Belkin,et al.  Laplacian Eigenmaps for Dimensionality Reduction and Data Representation , 2003, Neural Computation.

[16]  Michael I. Jordan,et al.  Multiple kernel learning, conic duality, and the SMO algorithm , 2004, ICML.

[17]  Kilian Q. Weinberger,et al.  Distance Metric Learning for Large Margin Nearest Neighbor Classification , 2005, NIPS.

[18]  Yann LeCun,et al.  Dimensionality Reduction by Learning an Invariant Mapping , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[19]  Stéphane Lafon,et al.  Diffusion maps , 2006 .

[20]  Alexandr Andoni,et al.  Near-Optimal Hashing Algorithms for Approximate Nearest Neighbor in High Dimensions , 2006, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06).

[21]  Inderjit S. Dhillon,et al.  Information-theoretic metric learning , 2006, ICML '07.

[22]  Yoshua. Bengio,et al.  Learning Deep Architectures for AI , 2007, Found. Trends Mach. Learn..

[23]  Antonio Torralba,et al.  Ieee Transactions on Pattern Analysis and Machine Intelligence 1 80 Million Tiny Images: a Large Dataset for Non-parametric Object and Scene Recognition , 2022 .

[24]  Antonio Torralba,et al.  Spectral Hashing , 2008, NIPS.

[25]  Antonio Torralba,et al.  Small codes and large image databases for recognition , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[26]  Lei Wang,et al.  Positive Semidefinite Metric Learning with Boosting , 2009, NIPS.

[27]  Alex Krizhevsky,et al.  Learning Multiple Layers of Features from Tiny Images , 2009 .

[28]  Trevor Darrell,et al.  Learning to Hash with Binary Reconstructive Embeddings , 2009, NIPS.

[29]  Kristen Grauman,et al.  Kernelized locality-sensitive hashing for scalable image search , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[30]  Tat-Seng Chua,et al.  NUS-WIDE: a real-world web image database from National University of Singapore , 2009, CIVR '09.

[31]  Geoffrey E. Hinton,et al.  Semantic hashing , 2009, Int. J. Approx. Reason..

[32]  Bernhard Schölkopf,et al.  Learning similarity measure for multi-modal 3D image registration , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[33]  Gert R. G. Lanckriet,et al.  Partial order embedding with multiple kernels , 2009, ICML '09.

[34]  Roger Levy,et al.  A new approach to cross-modal multimedia retrieval , 2010, ACM Multimedia.

[35]  Shih-Fu Chang,et al.  Sequential Projection Learning for Hashing with Compact Codes , 2010, ICML.

[36]  Jason Weston,et al.  Large scale image annotation: learning to rank with joint word-image embeddings , 2010, Machine Learning.

[37]  Nikos Paragios,et al.  Data fusion through cross-modality metric learning using similarity-sensitive hashing , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[38]  Quoc V. Le,et al.  On optimization methods for deep learning , 2011, ICML.

[39]  Charu C. Aggarwal,et al.  Towards cross-category knowledge propagation for learning visual concepts , 2011, CVPR 2011.

[40]  Leonidas J. Guibas,et al.  Shape google: Geometric words and expressions for invariant shape retrieval , 2011, TOGS.

[41]  David J. Fleet,et al.  Minimal Loss Hashing for Compact Binary Codes , 2011, ICML.

[42]  Christoph Bregler,et al.  Learning invariance through imitation , 2011, CVPR 2011.

[43]  Svetlana Lazebnik,et al.  Iterative quantization: A procrustean approach to learning binary codes , 2011, CVPR 2011.

[44]  Shai Avidan,et al.  Coherency Sensitive Hashing , 2011, ICCV.

[45]  Gert R. G. Lanckriet,et al.  Learning Multi-modal Similarity , 2010, J. Mach. Learn. Res..

[46]  Wei Liu,et al.  Hashing with Graphs , 2011, ICML.

[47]  David J. Fleet,et al.  Hamming Distance Metric Learning , 2012, NIPS.

[48]  Sanjiv Kumar,et al.  Angular Quantization-based Binary Codes for Fast Similarity Search , 2012, NIPS.

[49]  David W. Jacobs,et al.  Generalized Multiview Analysis: A discriminative latent space , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[50]  Pascal Fua,et al.  LDAHash: Improved Matching with Smaller Descriptors , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[51]  Rongrong Ji,et al.  Supervised hashing with kernels , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[52]  Jürgen Schmidhuber,et al.  Descriptor Learning for Omnidirectional Image Matching , 2011, Registration and Recognition in Images and Videos.