Computational Linguistics and Intelligent Text Processing

We present an LFG syntax-semantics interface for the semi-automatic annotation of frame semantic roles for German in the SALSA project. The architecture is intended to support a bootstrapping cycle for acquiring stochastic models of frame semantic role assignment. The cycle starts from manual annotations built on the syntactically annotated TIGER treebank and moves smoothly to automatic syntactic analysis and (semi-)automatic semantic annotation of a much larger corpus, on top of a free-running LFG grammar of German. Our study investigates the applicability of the LFG formalism to modeling frame semantic role annotation and designs a flexible, extensible syntax-semantics architecture that supports the induction of stochastic models for automatic frame assignment. To translate between the TIGER and LFG annotation formats, we propose a method familiar from example-based Machine Translation, thus enabling the transition from treebank annotation to large-scale corpus processing.
