Paper information - Feature-Rich Part-of-Speech Tagging with a Cyclic Dependency Network

We present a new part-of-speech tagger that demonstrates the following ideas: (i) explicit use of both preceding and following tag contexts via a dependency network representation, (ii) broad use of lexical features, including jointly conditioning on multiple consecutive words, (iii) effective use of priors in conditional loglinear models, and (iv) fine-grained modeling of unknown word features. Using these ideas together, the resulting tagger gives a 97.24% accuracy on the Penn Treebank WSJ, an error reduction of 4.4% on the best previous single automatically learned tagging result.
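Ideas (i), (ii), and (iv) above can be illustrated with a small feature extractor. This is only a minimal sketch, not the authors' implementation: the function name `local_features`, the feature-string format, the boundary symbols `<S>` and `</S>`, and the particular word-shape features are all assumptions chosen for illustration. It shows a local model for one tag drawing on both the preceding and the following tag, on pairs of consecutive words, and on a few surface cues that could help with unknown words.

```python
from typing import List


def local_features(words: List[str], tags: List[str], i: int) -> List[str]:
    """Hypothetical feature templates for predicting tags[i] from its context.

    The templates are illustrative assumptions, not the paper's exact set.
    """

    def at(seq: List[str], j: int, pad: str) -> str:
        # Return seq[j], or a boundary symbol when j falls outside the sentence.
        return seq[j] if 0 <= j < len(seq) else pad

    w = words[i]
    prev_t = at(tags, i - 1, "<S>")
    next_t = at(tags, i + 1, "</S>")
    prev_w = at(words, i - 1, "<S>")
    next_w = at(words, i + 1, "</S>")

    feats = [
        f"w={w}",
        f"t-1={prev_t}",                 # preceding tag context
        f"t+1={next_t}",                 # following tag context
        f"t-1,t+1={prev_t},{next_t}",    # both neighbors jointly
        f"w-1,w={prev_w},{w}",           # consecutive words, jointly
        f"w,w+1={w},{next_w}",
        f"suffix3={w[-3:]}",             # surface cue for unseen words
    ]
    if w[:1].isupper():
        feats.append("shape=init-cap")
    if "-" in w:
        feats.append("shape=has-hyphen")
    return feats
```

In a conditional loglinear model, each such feature string, paired with a candidate tag, would receive a learned weight; how the weights are estimated and how priors are applied is left out of this sketch.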

[1] Kenneth Ward Church. A Stochastic Parts Program and Noun Phrase Parser for Unrestricted Text, 1989, ANLP.

[2] Beatrice Santorini, et al. Building a Large Annotated Corpus of English: The Penn Treebank, 1993, CL.

[3] Eugene Charniak, et al. Equations for Part-of-Speech Tagging, 1993, AAAI.

[4] Eric Brill, et al. Transformation-Based Error-Driven Learning and Natural Language Processing: A Case Study in Part-of-Speech Tagging, 1995, CL.

[5] Adwait Ratnaparkhi, et al. A Maximum Entropy Model for Part-Of-Speech Tagging, 1996, EMNLP.

[6] Eric Brill, et al. Classifier Combination for Improved Lexical Disambiguation, 1998, ACL.

[7] Mark Johnson, et al. Estimators for Stochastic “Unification-Based” Grammars, 1999, ACL.

[8] Yoram Singer, et al. Boosting Applied to Tagging and PP Attachment, 1999, EMNLP.

[9] David J. Spiegelhalter, et al. Probabilistic Networks and Expert Systems, 1999, Information Science and Statistics.

[10] Mary P. Harper, et al. A Second-Order Hidden Markov Model for Part-of-Speech Tagging, 1999, ACL.

[11] Christopher D. Manning, et al. Enriching the Knowledge Sources Used in a Maximum Entropy Part-of-Speech Tagger, 2000, EMNLP.

[12] David Maxwell Chickering, et al. Dependency Networks for Inference, Collaborative Filtering, and Data Visualization, 2000, J. Mach. Learn. Res.

[13] Hae-Chang Rim, et al. Part-of-Speech Tagging Based on Hidden Markov Model Assuming Joint Independence, 2000, ACL.

[14] Thorsten Brants, et al. TnT – A Statistical Part-of-Speech Tagger, 2000, ANLP.

[15] Andrew McCallum, et al. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data, 2001, ICML.

[16] Dan Klein, et al. Conditional Structure versus Conditional Estimation in NLP Models, 2002, EMNLP.

[17] Michael Collins, et al. Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms, 2002, EMNLP.

[18] Andrew W. Moore, et al. Fast Robust Logistic Regression for Large Sparse Datasets with Binary Outputs, 2003, AISTATS.

[19] Tong Zhang, et al. Text Categorization Based on Regularized Linear Classification Methods, 2001, Information Retrieval.