Statistical Language Learning

From the Publisher: Eugene Charniak breaks new ground in artificial intelligence research by presenting statistical language processing from an artificial intelligence point of view, in a text for researchers and scientists with a traditional computer science background. New, exacting empirical methods are needed to break the deadlock in such areas of artificial intelligence as robotics, knowledge representation, machine learning, machine translation, and natural language processing (NLP). It is time, Charniak observes, to switch paradigms.

The text introduces statistical language processing techniques -- word tagging, parsing with probabilistic context-free grammars, grammar induction, syntactic disambiguation, semantic word classes, and word-sense disambiguation -- along with the underlying mathematics, and includes chapter exercises. Charniak points out that, as a method of attacking NLP problems, the statistical approach has several advantages: it is grounded in real text and therefore promises to produce usable results, and it offers an obvious way to approach learning: "one simply gathers statistics."

Series: Language, Speech, and Communication
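As a minimal illustration of the "one simply gathers statistics" idea (a sketch for this summary, not code from the book), a maximum-likelihood bigram language model can be estimated directly from raw counts over a corpus:

```python
from collections import Counter

def bigram_model(tokens):
    """Estimate P(w2 | w1) by maximum likelihood: count(w1, w2) / count(w1)."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return {(w1, w2): c / unigrams[w1] for (w1, w2), c in bigrams.items()}

# Toy corpus for illustration only
corpus = "the dog ran the dog sat the cat ran".split()
probs = bigram_model(corpus)
# "the" occurs 3 times and is followed by "dog" twice, so P(dog | the) = 2/3
```

Unsmoothed counts like these assign zero probability to unseen bigrams, which is exactly the estimation problem (smoothing, class-based models) that motivates much of the book's mathematics.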
