Bio-Inspired Hashing for Unsupervised Similarity Search

The olfactory circuit of the fruit fly Drosophila has inspired a new locality-sensitive hashing (LSH) algorithm, FlyHash. In contrast with classical LSH algorithms that produce low-dimensional hash codes, FlyHash produces sparse, high-dimensional hash codes and has been shown to outperform classical LSH algorithms empirically in similarity search. However, FlyHash uses random projections and cannot learn from data. Building on FlyHash and on the ubiquity of sparse expansive representations in neurobiology, our work proposes a novel hashing algorithm, BioHash, that produces sparse, high-dimensional hash codes in a data-driven manner. We show that BioHash outperforms previously published benchmarks for various hashing methods. Since our learning algorithm is based on a local and biologically plausible synaptic plasticity rule, our work provides evidence for the proposal that LSH might be a computational reason for the abundance of sparse expansive motifs in a variety of biological systems. We also propose a convolutional variant, BioConvHash, that further improves performance. From the perspective of computer science, BioHash and BioConvHash are fast and scalable, and they yield compressed binary representations that are useful for similarity search.
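
The sparse, expansive coding idea behind FlyHash can be illustrated with a short sketch. This is only a minimal example of a random-projection hash with winner-take-all sparsification, assuming a sparse binary projection and a fixed number of active units; it is not the BioHash learning rule, and the function names (`make_projection`, `sparse_hash`, `overlap`) and all dimensions are hypothetical choices made for illustration.

```python
import random


def make_projection(input_dim, hash_dim, samples_per_unit, seed=0):
    """Build a sparse binary random projection (illustrative assumption).

    Each of the hash_dim output units is connected to a random subset of
    samples_per_unit input coordinates.
    """
    rng = random.Random(seed)
    return [rng.sample(range(input_dim), samples_per_unit) for _ in range(hash_dim)]


def sparse_hash(x, projection, k):
    """Return the indices of the k most strongly activated output units.

    The hash code is binary and sparse: these k indices are the 1-bits,
    and every other position in the hash_dim-long code is 0.
    """
    activations = [sum(x[i] for i in idx) for idx in projection]
    ranked = sorted(range(len(activations)), key=lambda j: activations[j], reverse=True)
    return frozenset(ranked[:k])


def overlap(code_a, code_b):
    """Similarity between two sparse codes: the number of shared active bits."""
    return len(code_a & code_b)


# Expand a 20-dimensional input into a 200-dimensional code with 10 active bits.
rng = random.Random(1)
x = [rng.random() for _ in range(20)]
proj = make_projection(input_dim=20, hash_dim=200, samples_per_unit=4)
code = sparse_hash(x, proj, k=10)
```

Here the projection is fixed at random, which is the limitation the abstract points to; a data-driven method would instead learn the connections from the data.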
