Distributional Information: A Powerful Cue for Acquiring Syntactic Categories

Many theorists have dismissed a priori the idea that distributional information could play a significant role in syntactic category acquisition. We demonstrate empirically that such information provides a powerful cue to syntactic category membership, one that can be exploited by a variety of simple, psychologically plausible mechanisms. We present a range of results using a large corpus of child-directed speech and explore their psychological implications. While our results show that a considerable amount of information about syntactic categories can be obtained from distributional information alone, we stress that many other sources of information may also contribute to the identification of syntactic classes.
