Integrating probabilistic models of perception and interactive neural networks: a historical and tutorial review

This article seeks to establish a rapprochement between explicitly Bayesian models of contextual effects in perception and neural network models of such effects, particularly the connectionist interactive activation (IA) model of perception. The article is part historical review and part tutorial. It reviews the probabilistic Bayesian approach to understanding perception and how perception may be shaped by context, and it reviews ideas about how such probabilistic computations may be carried out in neural networks. The focus is on the role of context in interactive neural networks, in which both bottom-up and top-down signals affect the interpretation of sensory inputs. Connectionist units that use the logistic or softmax activation functions can exactly compute Bayesian posterior probabilities when the bias terms and connection weights affecting such units are set to the logarithms of appropriate probabilistic quantities. Bayesian concepts such as the prior, likelihood, (joint and marginal) posterior, probability matching and maximizing, and calculating vs. sampling from the posterior are all reviewed and linked to neural network computations. Probabilistic and neural network models are explicitly linked to the concept of a probabilistic generative model, which describes the relationship between the underlying target of perception (e.g., the word intended by a speaker or other source of sensory stimuli) and the sensory input that reaches the perceiver for use in inferring that target. A new version of the IA model, the multinomial interactive activation (MIA) model, is shown to sample correctly from the joint posterior of a proposed generative model for perception of letters in words, indicating that interactive processing is fully consistent with principled probabilistic computation. Ways in which these computations might be realized in real neural systems are also considered.
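
The claim about softmax units and log-probability weights can be checked numerically. The sketch below is a minimal illustration, not the article's own implementation. It assumes three candidate letters and a single active feature unit, and the prior and likelihood values, along with all function and variable names, are hypothetical. Each letter unit's bias is set to the log of its prior and its incoming weight to the log of the likelihood of the feature given that letter. The softmax of the resulting net inputs is then compared with the posterior obtained directly from Bayes' rule. The last lines contrast maximizing (always choosing the most probable letter) with probability matching (sampling a letter in proportion to its posterior).

```python
import math
import random

def softmax(net_inputs):
    """Softmax over net inputs, subtracting the max for numerical stability."""
    m = max(net_inputs)
    exps = [math.exp(x - m) for x in net_inputs]
    total = sum(exps)
    return [e / total for e in exps]

def bayes_posterior(priors, likelihoods):
    """Bayes' rule: p(h | e) = p(h) p(e | h) / sum over h' of p(h') p(e | h')."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]

# Hypothetical values, chosen only for illustration.
letters = ["A", "B", "C"]
priors = [0.5, 0.3, 0.2]        # p(letter)
likelihoods = [0.1, 0.6, 0.3]   # p(feature | letter)

# Bias = log prior, weight = log likelihood, feature unit activation = 1.
biases = [math.log(p) for p in priors]
weights = [math.log(l) for l in likelihoods]
feature_activation = 1.0
net_inputs = [b + w * feature_activation for b, w in zip(biases, weights)]

unit_activations = softmax(net_inputs)
posterior = bayes_posterior(priors, likelihoods)

# Maximizing: always report the letter with the highest posterior.
best_letter = letters[max(range(len(letters)), key=lambda i: unit_activations[i])]

# Probability matching: sample letters in proportion to the posterior.
rng = random.Random(0)
samples = rng.choices(letters, weights=unit_activations, k=10000)
sample_freqs = [samples.count(letter) / len(samples) for letter in letters]
```

Under these assumptions, `unit_activations` and `posterior` agree up to floating-point error, and `sample_freqs` approaches the posterior as the number of samples grows.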
