Resonator Circuits for factoring high-dimensional vectors

We describe a type of neural network, called a Resonator Circuit, that factors high-dimensional vectors. Given a composite vector formed by the Hadamard product of several vectors, each drawn from a discrete set of candidates, a Resonator Circuit can efficiently decompose the composite into those factors. This paper focuses on the case of "bipolar" vectors, whose elements are $\pm1$, and characterizes the solution quality, stability properties, and speed of Resonator Circuits in comparison to several benchmark optimization methods, including Alternating Least Squares, Iterative Soft Thresholding, and Multiplicative Weights. We find that Resonator Circuits substantially outperform these alternatives by combining powerful nonlinear dynamics with "searching in superposition," by which we mean that at any given time the estimate of each factor is a weighted superposition of all possible solutions. The alternative methods we consider also search in superposition, but the dynamics of Resonator Circuits strike a more effective balance between exploring the solution space and exploiting local information to drive the network toward probable solutions. A Resonator Circuit can be conceptualized as a set of interconnected Hopfield Networks, and this connection sharpens the analysis of its dynamics: whereas a Hopfield Network descends an energy function and is therefore guaranteed to converge, a Resonator Circuit carries no such guarantee. There exists, however, a high-fidelity regime in which Resonator Circuits almost always converge, and within it they solve the factorization problem extremely well. Because factorization is central to many aspects of perception and cognition, we believe that Resonator Circuits may bring us a step closer to understanding how this computationally difficult problem is solved efficiently by neural circuits in brains.
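To make the setup and the dynamics concrete, here is a minimal NumPy sketch of the factorization loop, assuming the commonly described outer-product form of the update, $\hat{\mathbf{x}}_f \leftarrow \mathrm{sign}\big(X_f X_f^{\top}(\mathbf{s} \odot \bigodot_{g \neq f} \hat{\mathbf{x}}_g)\big)$, in which each factor estimate is unbound from the composite via the Hadamard product (self-inverting for bipolar vectors), cleaned up through its codebook, and re-binarized. The dimensions, codebook sizes, and iteration cap below are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

N = 1500  # vector dimensionality (illustrative)
D = 30    # candidate codevectors per factor (illustrative)
F = 3     # number of factors

def bipolar_sign(v):
    """Elementwise sign with ties broken toward +1, keeping vectors bipolar."""
    return np.where(v >= 0, 1.0, -1.0)

# One random bipolar codebook per factor; columns are candidate codevectors.
codebooks = [rng.choice([-1.0, 1.0], size=(N, D)) for _ in range(F)]

# Composite vector: the Hadamard product of one codevector from each codebook.
true_idx = [int(rng.integers(D)) for _ in range(F)]
s = np.ones(N)
for X, k in zip(codebooks, true_idx):
    s *= X[:, k]

# "Search in superposition": initialize each estimate from the sum of all
# candidates, so every possible solution starts with equal weight.
est = [bipolar_sign(X.sum(axis=1)) for X in codebooks]

for _ in range(200):
    prev = [e.copy() for e in est]
    for f in range(F):
        # Unbind: multiply the other factors' current estimates back into
        # the composite (v * v = 1 elementwise for bipolar v).
        x_tilde = s.copy()
        for g in range(F):
            if g != f:
                x_tilde *= est[g]
        # Clean up through the codebook (a Hopfield-style outer-product
        # memory) and re-binarize with the sign nonlinearity.
        X = codebooks[f]
        est[f] = bipolar_sign(X @ (X.T @ x_tilde))
    if all(np.array_equal(e, p) for e, p in zip(est, prev)):
        break  # fixed point: every estimate has stopped changing

# Decode each factor as the codevector most correlated with its estimate.
decoded = [int(np.argmax(X.T @ e)) for X, e in zip(codebooks, est)]
print("decoded:", decoded, "true:", true_idx)
```

Note how the superposition initialization and the sign nonlinearity work together: early iterations keep many candidates in play, while each cleanup step concentrates weight on the candidates most consistent with the composite, which is the explore/exploit balance the abstract attributes to these circuits.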
