Optimization and Quantization in Gradient Symbol Systems: A Framework for Integrating the Continuous and the Discrete in Cognition

Mental representations have continuous as well as discrete, combinatorial properties. For example, while predominantly discrete, phonological representations also vary continuously; this is reflected by gradient effects in instrumental studies of speech production. Can an integrated theoretical framework address both aspects of structure? The framework we introduce here, Gradient Symbol Processing, characterizes the emergence of grammatical macrostructure from the Parallel Distributed Processing microstructure (McClelland, Rumelhart, & The PDP Research Group, 1986) of language processing. The mental representations that emerge, Distributed Symbol Systems, have both combinatorial and gradient structure. They are processed through Subsymbolic Optimization-Quantization, in which an optimization process favoring representations that satisfy well-formedness constraints operates in parallel with a distributed quantization process favoring discrete symbolic structures. We apply a particular instantiation of this framework, λ-Diffusion Theory, to phonological production. Simulations of the resulting model suggest that Gradient Symbol Processing offers a way to unify accounts of grammatical competence with both discrete and continuous patterns in language performance.

[1]  Jeffrey S Bowers,et al.  Challenging the widespread assumption that connectionism and distributed representations go hand-in-hand , 2002, Cognitive Psychology.

[2]  James L. McClelland Toward a theory of information processing in graded, random, and interactive networks , 1993 .

[3]  Marianne Pouplier,et al.  Intention in articulation: Articulatory timing in alternating consonant sequences and its implications for models of speech production , 2010, Language and cognitive processes.

[4]  H. Bird Slips of the ear as evidence for the postperceptual priority of grammaticality , 1998 .

[5]  S. Pinker,et al.  On language and connectionism: Analysis of a parallel distributed processing model of language acquisition , 1988, Cognition.

[6]  James L. McClelland,et al.  An interactive activation model of context effects in letter perception: I. An account of basic findings. , 1981 .

[7]  M. Goldrick,et al.  Does like attract like? Exploring the relationship between errors and representational structure in connectionist networks , 2008, Cognitive neuropsychology.

[8]  Tony Plate,et al.  Holographic Reduced Representations: Convolution Algebra for Compositional Distributed Representations , 1991, IJCAI.

[9]  R. Golden The "Brain-state-in-a-box" neural model is a gradient descent algorithm , 1986 .

[10]  Paul Smolensky,et al.  Tensor Product Variable Binding and the Representation of Symbolic Structures in Connectionist Systems , 1990, Artif. Intell..

[11]  James L. McClelland,et al.  Understanding normal and impaired word reading: computational principles in quasi-regular domains. , 1996, Psychological review.

[12]  Daniel Recasens,et al.  Dispersion and variability in Catalan five and six peripheral vowel systems , 2009, Speech Commun..

[13]  James L. McClelland,et al.  An Introduction to Linear Algebra in Parallel Distributed Processing , 1987 .

[14]  Manuel Perea,et al.  Transposed-Letter Confusability Effects in Masked Form Priming , 2003 .

[15]  Kenneth N. Stevens,et al.  Quantal theory, enhancement and overlap , 2010, J. Phonetics.

[16]  Jeffrey L. Elman,et al.  Finding Structure in Time , 1990, Cogn. Sci..

[17]  Tony A. Plate,et al.  Holographic Reduced Representation: Distributed Representation for Cognitive Structures , 2003 .

[18]  G. E. Peterson,et al.  Duration of Syllable Nuclei in English , 1960 .

[19]  Paul Smolensky Neural and Conceptual Interpretations of Parallel Distributed Processing Models , 1986 .

[20]  Jerome A. Feldman,et al.  Connectionist Models and Their Properties , 1982, Cogn. Sci..

[21]  Joe Pater,et al.  Weighted Constraints in Generative Linguistics , 2009, Cogn. Sci..

[22]  R. Golden A unified framework for connectionist systems , 1988, Biological Cybernetics.

[23]  Simon D. Levy,et al.  A DISTRIBUTED BASIS FOR ANALOGICAL MAPPING , 2009 .

[24]  James L. McClelland,et al.  An interactive activation model of context effects in letter perception: Part 2. The contextual enhancement effect and some tests and extensions of the model. , 1982, Psychological review.

[25]  Matthew Goldrick,et al.  Interaction and representational integration: Evidence from speech errors , 2011, Cognition.

[26]  Adamantios I. Gafos,et al.  Dynamics of Phonological Cognition , 2006, Cogn. Sci..

[27]  Willem J. M. Levelt,et al.  A theory of lexical access in speech production , 1999, Behavioral and Brain Sciences.

[28]  John J. Hopfield,et al.  Neural networks and physical systems with emergent collective computational abilities , 1999 .

[29]  Terrence J. Sejnowski,et al.  The Computational Brain , 1996, Artif. Intell..

[30]  Whitney Tabor,et al.  Fractal encoding of context‐free grammars in connectionist networks , 2000, Expert Syst. J. Knowl. Eng..

[31]  Zenon W. Pylyshyn,et al.  Computation and Cognition: Toward a Foundation for Cognitive Science , 1984 .

[32]  B. Lindblom,et al.  Numerical Simulation of Vowel Quality Systems: The Role of Perceptual Contrast , 1972 .

[33]  K. Forster,et al.  REPETITION PRIMING AND FREQUENCY ATTENUATION IN LEXICAL ACCESS , 1984 .

[34]  Geoffrey E. Hinton,et al.  A general framework for parallel distributed processing , 1986 .

[35]  Frank H. Eeckman,et al.  A normal form projection algorithm for associative memory , 1993 .

[36]  D. Rumelhart NOTES ON A SCHEMA FOR STORIES , 1975 .

[37]  Carolyn E. Wilshire,et al.  The “Tongue Twister” Paradigm as a Technique for Studying Phonological Encoding , 1999 .

[38]  S. Lupker,et al.  Masked inhibitory priming in english: evidence for lexical inhibition. , 2006, Journal of experimental psychology. Human perception and performance.

[39]  Roman Jakobson,et al.  Selected Writings, I: Phonological Studies , 1967 .

[40]  Steven Pinker,et al.  Wickelphone ambiguity , 1988, Cognition.

[41]  Tony A. Plate,et al.  Analogy retrieval and processing with distributed vector representations , 2000, Expert Syst. J. Knowl. Eng..

[42]  C. Lebiere,et al.  The Atomic Components of Thought , 1998 .

[43]  G. Marcus The Algebraic Mind: Integrating Connectionism and Cognitive Science , 2001 .

[44]  Donald A. Norman,et al.  Representation in Memory. , 1983 .

[45]  James L. McClelland,et al.  Graded Constraints on English Word Forms , 2006 .

[46]  Gordon D. A. Brown,et al.  Serial Control of Phonology in Speech Production: A Hierarchical Model , 2000, Cognitive Psychology.

[47]  M. Page,et al.  Connectionist modelling in psychology: A localist manifesto , 2000, Behavioral and Brain Sciences.

[48]  P. Smolensky,et al.  Optimality Theory: Constraint Interaction in Generative Grammar , 2004 .

[49]  J. Fodor,et al.  Connectionism and cognitive architecture: A critical analysis , 1988, Cognition.

[50]  Stefan Wermter,et al.  A Novel Modular Neural Architecture for Rule-Based and Similarity-Based Reasoning , 1998, Hybrid Neural Systems.

[51]  R. Jakobson,et al.  Selected Writings: I. Phonological Studies , 1965 .

[52]  Emmanuel Dupoux,et al.  Holographic String Encoding , 2011, Cogn. Sci..

[53]  Robert Daland,et al.  Linking speech errors and phonological grammars: insights from Harmonic Grammar networks , 2009, Phonology.

[54]  Paul Smolensky,et al.  Schema Selection and Stochastic Inference in Modular Environments , 1983, AAAI.

[55]  Ray Pike,et al.  Comparison of convolution and matrix distributed memory systems for associative recall and recognition , 1984 .

[56]  Donald Geman,et al.  Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images , 1984 .

[57]  Joan L. Bybee,et al.  Gradience of Gradience: A reply to Jackendoff , 2007 .

[58]  M. F. Garrett,et al.  The Analysis of Sentence Production , 1975 .

[59]  Manuel Perea,et al.  The overlap model: a model of letter position coding. , 2008, Psychological review.

[60]  S. Phillips,et al.  Processing capacity defined by relational complexity: implications for comparative, developmental, and cognitive psychology. , 1998, The Behavioral and brain sciences.

[61]  Marvin Minsky,et al.  A framework for representing knowledge , 1974 .

[62]  K. Holyoak,et al.  A symbolic-connectionist theory of relational inference and generalization. , 2003, Psychological review.

[63]  Allard Jongman,et al.  Incomplete neutralization and other sub-phonemic durational differences in production and perception: evidence from Dutch , 2004, J. Phonetics.

[64]  James L. McClelland,et al.  PDP models and general issues in cognitive science , 1986 .

[65]  Y. Miyata,et al.  Harmonic grammar: A formal multi-level connectionist theory of linguistic well-formedness: Theoretic , 1990 .

[66]  D. Hofstadter,et al.  Godel, Escher, Bach: An Eternal Golden Braid , 1979 .

[67]  Allen Newell,et al.  Physical Symbol Systems , 1980, Cogn. Sci..

[68]  James L. McClelland,et al.  A single-system account of semantic and lexical deficits in five semantic dementia patients , 2008, Cognitive neuropsychology.

[69]  M. Tanenhaus,et al.  Within-category VOT affects recovery from "lexical" garden paths: Evidence against phoneme-level inhibition. , 2009, Journal of memory and language.

[70]  Ross W. Gayler Vector Symbolic Architectures answer Jackendoff's challenges for cognitive neuroscience , 2004, ArXiv.

[71]  Simon D. Levy,et al.  Vector Symbolic Architectures: A New Building Material for Artificial General Intelligence , 2008, AGI.

[72]  A. M. Ramer Mathematical Methods in Linguistics , 1992 .

[73]  Geoffrey E. Hinton,et al.  Distributed Representations , 1986, The Philosophy of Artificial Intelligence.

[74]  B. Murdock A Theory for the Storage and Retrieval of Item and Associative Information. , 1982 .

[75]  D. Pisoni,et al.  Recognizing Spoken Words: The Neighborhood Activation Model , 1998, Ear and hearing.

[77]  Michael D. Lee,et al.  A Hierarchical Bayesian Model of Human Decision-Making on an Optimal Stopping Problem , 2006, Cogn. Sci..

[78]  Jerome A. Feldman,et al.  Neural Representation of Conceptual Knowledge. , 1986 .

[79]  Javier R. Movellan,et al.  Learning Continuous Probability Distributions with Symmetric Diffusion Networks , 1993, Cogn. Sci..

[80]  R. Port,et al.  Against Formal Phonology , 2005 .

[81]  Geoffrey E. Hinton,et al.  Parallel Models of Associative Memory , 1989 .

[82]  Jordan B. Pollack,et al.  Recursive Distributed Representations , 1990, Artif. Intell..

[83]  John K. Kruschke,et al.  The Cambridge Handbook of Computational Psychology: Models of Categorization , 2008 .

[84]  James L. McClelland,et al.  On learning the past-tenses of English verbs: implicit rules or parallel distributed processing , 1986 .

[85]  Pentti Kanerva,et al.  Hyperdimensional Computing: An Introduction to Computing in Distributed Representation with High-Dimensional Random Vectors , 2009, Cognitive Computation.

[86]  J J Hopfield,et al.  Neurons with graded response have collective computational properties like those of two-state neurons. , 1984, Proceedings of the National Academy of Sciences of the United States of America.

[87]  James L. McClelland,et al.  Alternatives to the combinatorial paradigm of linguistic theory based on domain general principles of human cognition , 2005 .

[88]  James L. McClelland,et al.  An interactive activation model of context effects in letter perception: Part 1. An account of basic findings , 1988 .

[89]  Javier R. Movellan,et al.  A Learning Theorem for Networks at Detailed Stochastic Equilibrium , 1998, Neural Computation.

[90]  Geoffrey E. Hinton,et al.  OPTIMAL PERCEPTUAL INFERENCE , 1983 .

[91]  James L. McClelland,et al.  The TRACE model of speech perception , 1986, Cognitive Psychology.

[92]  James L. McClelland,et al.  Mechanisms of Sentence Processing: Assigning Roles to Constituents of Sentences , 1986 .

[94]  James L. McClelland,et al.  Parallel Distributed Processing: Explorations in the Microstructure of Cognition: Psychological and Biological Models , 1986 .

[95]  Janet B. Pierrehumbert,et al.  The next toolkit , 2006, J. Phonetics.

[96]  Lisa Davidson,et al.  Phonotactics and Articulatory Coordination Interact in Phonology: Evidence from Nonnative Production , 2006, Cogn. Sci..

[97]  Geoffrey E. Hinton,et al.  Learning and relearning in Boltzmann machines , 1986 .

[98]  James L. McClelland,et al.  Parallel distributed processing: explorations in the microstructure of cognition, vol. 1: foundations , 1986 .

[99]  S. Blumstein,et al.  Cascading activation from phonological planning to articulatory processes: Evidence from tongue twisters , 2006 .

[100]  D. Hofstadter Metamagical Themas: Questing for the Essence of Mind and Pattern , 1985 .

[101]  B. Rapp,et al.  Lexical and post-lexical phonological representations in spoken production , 2007, Cognition.

[102]  Terence D Sanger,et al.  Neural population codes , 2003, Current Opinion in Neurobiology.

[103]  R. Port,et al.  Neutralization of syllable-final voicing in German , 1985 .

[104]  Stefanie Shattuck-Hufnagel,et al.  The Limited Use of Distinctive Features and Markedness in Speech Production: Evidence from Speech Error Data. , 1979 .

[105]  G S Dell,et al.  A spreading-activation theory of retrieval in sentence production. , 1986, Psychological review.

[106]  Edward Flemming Scalar and categorical phenomena in a unified model of phonetics and phonology , 2001, Phonology.

[107]  A. Jongman,et al.  Acoustic and perceptual evidence for complete neutralization of manner of articulation in Korean , 1996 .

[108]  G. Kane Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vol 1: Foundations, vol 2: Psychological and Biological Models , 1994 .

[109]  Joe Pater The harmonic mind: from neural computation to optimality-theoretic grammar , 2009 .

[110]  C. Browman,et al.  Articulatory Phonology: An Overview , 1992, Phonetica.

[111]  Géraldine Legendre,et al.  The harmonic mind: From neural computation to optimality-theoretic grammar (Vol. 2: Linguistic and philosophical implications). , 2006 .

[112]  Jeffrey S Bowers,et al.  On the biological plausibility of grandmother cells: implications for neural network theories in psychology and neuroscience. , 2009, Psychological review.

[113]  H B Barlow,et al.  Single units and sensation: a neuron doctrine for perceptual psychology? , 1972, Perception.

[114]  Ross W. Gayler,et al.  “Lateral Inhibition” in a Fully Distributed Connectionist Architecture , 2009 .