Distributed Representations, Simple Recurrent Networks, and Grammatical Structure

This paper considers three problems for a connectionist account of language: (1) What is the nature of linguistic representations? (2) How can complex structural relationships such as constituent structure be represented? (3) How can the apparently open-ended nature of language be accommodated by a fixed-resource system? Using a prediction task, a simple recurrent network (SRN) is trained on multiclausal sentences that contain multiply embedded relative clauses. Principal component analysis of the hidden unit activation patterns reveals that the network solves the task by developing complex distributed representations that encode the relevant grammatical relations and hierarchical constituent structure. Differences between the SRN state representations and the more traditional pushdown store are discussed in the final section.
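
To make the setup described in the abstract concrete, the following is a minimal sketch, not the paper's implementation. It runs an untrained Elman-style SRN over one sentence, producing a next-word distribution at each step, and then projects the hidden-unit states onto their first principal component. Everything specific here is an assumption made for illustration: the toy vocabulary and sentence, the hidden layer size, the random weights, the logistic hidden units, the softmax output, the zero initial context, and the use of power iteration to find the leading component. Training on the prediction task is omitted.

```python
import math
import random

def make_matrix(rows, cols, rng, scale=0.5):
    # Small random weights; the scale is an arbitrary illustrative choice.
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def softmax(z):
    top = max(z)
    exps = [math.exp(x - top) for x in z]
    total = sum(exps)
    return [e / total for e in exps]

def srn_run(tokens, vocab_size, w_in, w_ctx, w_out):
    """Forward pass of an SRN: the hidden state at step t depends on the
    current word and on a copy of the hidden state from step t - 1."""
    context = [0.0] * len(w_in)  # assumed initial context
    hiddens, predictions = [], []
    for tok in tokens:
        one_hot = [1.0 if i == tok else 0.0 for i in range(vocab_size)]
        net = [a + b for a, b in zip(matvec(w_in, one_hot), matvec(w_ctx, context))]
        hidden = [logistic(x) for x in net]
        predictions.append(softmax(matvec(w_out, hidden)))  # next-word guess
        hiddens.append(hidden)
        context = hidden  # the context units copy the hidden state
    return hiddens, predictions

def first_principal_component(rows, iters=200):
    """Leading eigenvector of the sample covariance, found by power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(iters):
        v = matvec(cov, v)
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    return v, centered

# Hypothetical toy vocabulary and sentence with one embedded relative clause.
rng = random.Random(0)
vocab = ["boy", "who", "chases", "dogs", "sees", "girl", "."]
sentence = ["boy", "who", "chases", "dogs", "sees", "girl", "."]
tokens = [vocab.index(w) for w in sentence]

hidden_size = 4
w_in = make_matrix(hidden_size, len(vocab), rng)
w_ctx = make_matrix(hidden_size, hidden_size, rng)
w_out = make_matrix(len(vocab), hidden_size, rng)

hiddens, predictions = srn_run(tokens, len(vocab), w_in, w_ctx, w_out)
pc1, centered = first_principal_component(hiddens)

# Position of each word's hidden state along the first principal component.
trajectory = [sum(a * b for a, b in zip(row, pc1)) for row in centered]
```

With trained weights, plotting `trajectory` (or several components against each other) word by word is one way to inspect how the hidden state moves through a sentence. With the random weights used here, the values carry no linguistic meaning.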
