Building predictive models on complex symbolic sequences with a second-order recurrent BCM network with lateral inhibition

Activation patterns across recurrent units in recurrent neural networks (RNNs) can be thought of as spatial codes of the history of inputs seen so far. When trained on symbolic sequences to perform next-symbol prediction, RNNs tend to organize their state space so that "close" recurrent activation vectors correspond to histories of symbols yielding similar next-symbol distributions (1). This leads to simple finite-context predictive models built on top of recurrent activations by grouping close activation patterns via vector quantization. In this paper we investigate an unsupervised alternative to this state space organization. In particular, we use a recurrent version of the Bienenstock, Cooper and Munro (BCM) network with lateral inhibition (2) to map histories of symbols into activations of the recurrent layer. Recurrent BCM networks perform a kind of time-conditional projection pursuit. We compare the finite-context models built on top of BCM recurrent activations with those constructed on top of RNN recurrent activation vectors. As a test bed we use two complex symbolic sequences with rather deep memory structures. Surprisingly, the BCM-based model performs comparably to, or better than, its RNN-based counterpart. This can be explained by the familiar information latching problem in recurrent networks when longer time spans are to be latched (3,4).
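
To make the construction concrete, the following is a minimal sketch (not the authors' code) of how a finite-context predictive model can be built on top of recurrent activations: activation vectors collected while driving a trained network over the sequence are grouped by a simple vector quantizer, and each codebook vector accumulates a smoothed next-symbol distribution. The Lloyd-style k-means quantizer, the cluster count `k`, and the Laplace smoothing constant `alpha` are illustrative choices, not taken from the paper.

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Simple Lloyd's-algorithm vector quantizer over activation vectors X (N x d)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # assign each activation vector to its nearest codebook vector
        labels = np.argmin(((X[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

def next_symbol_model(activations, next_symbols, n_symbols, k=16, alpha=1.0):
    """Estimate P(next symbol | cluster) with Laplace smoothing alpha."""
    centers, labels = kmeans(activations, k)
    counts = np.full((k, n_symbols), alpha)
    for c, s in zip(labels, next_symbols):
        counts[c, s] += 1
    probs = counts / counts.sum(axis=1, keepdims=True)
    return centers, probs

def predict(activation, centers, probs):
    """Predict the next-symbol distribution for a single activation vector."""
    c = np.argmin(((centers - activation) ** 2).sum(-1))
    return probs[c]

if __name__ == "__main__":
    # Toy usage with random "activations" standing in for recurrent states.
    rng = np.random.default_rng(1)
    acts = rng.random((500, 8))            # 500 activation vectors of dimension 8
    nxt = rng.integers(0, 4, size=500)     # next symbols from a 4-letter alphabet
    centers, probs = next_symbol_model(acts, nxt, n_symbols=4, k=8)
    print(predict(acts[0], centers, probs))
```

The same construction applies whether the activations come from a trained RNN or from the recurrent BCM network; only the source of the activation vectors changes.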

[1]  M. E. Diamond, et al.  Dynamic synaptic modification threshold: computational model of experience-dependent plasticity in adult rat barrel cortex, 1994, Proceedings of the National Academy of Sciences of the United States of America.

[2]  Nathan Intrator, et al.  Objective function formulation of the BCM theory of visual cortical plasticity: Statistical connections, stability conditions, 1992, Neural Networks.

[3]  Peter Tiño, et al.  Spatial representation of symbolic sequences through iterative function systems, 1999, IEEE Trans. Syst. Man Cybern. Part A.

[4]  Ronald J. Williams, et al.  A Learning Algorithm for Continually Running Fully Recurrent Neural Networks, 1989, Neural Computation.

[5]  Yoshua Bengio, et al.  Learning long-term dependencies with gradient descent is difficult, 1994, IEEE Trans. Neural Networks.

[6]  Peter Tiño, et al.  Extracting finite-state representations from recurrent neural networks trained on chaotic symbolic sequences, 1999, IEEE Trans. Neural Networks.

[7]  Nathan Intrator, et al.  Multimodality exploration by an unsupervised projection pursuit neural network, 1998, IEEE Trans. Neural Networks.

[8]  Ebeling, et al.  Self-similar sequences and universal scaling of dynamical entropies, 1996, Physical Review E, Statistical physics, plasmas, fluids, and related interdisciplinary topics.

[9]  Scott A. Musman, et al.  Unsupervised BCM projection pursuit algorithms for classification of simulated radar presentations, 1994, Neural Networks.