Word Sense Disambiguation with Information Retrieval Technique

This paper reports on word sense disambiguation of Korean nouns using information retrieval techniques. First, context vectors are constructed from the contextual words in the training data, and the words in each context vector are weighted by local density. Each sense of a target word is represented in word space as a ‘Static Sense Vector’, the centroid of the context vectors for that sense. Contextual noise is removed by selective sampling, which uses an information retrieval technique to enhance discriminative power: training samples are treated as indexed documents and test samples as queries. For each query (a test sample), the top-N relevant training samples are retrieved and used to construct a ‘Dynamic Sense Vector’. The word sense is then estimated using both the ‘Static Sense Vector’ and the ‘Dynamic Sense Vector’. The method is evaluated on the Korean SENSEVAL test suite and produces relatively good results.
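The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions: context vectors use raw term counts instead of the local-density weighting (which the abstract does not define), retrieval ranks training samples by cosine similarity, and the final score is simply the sum of the similarities to the static and dynamic sense vectors. All function names and the toy English data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def context_vector(words):
    """Bag-of-words context vector. Assumption: raw counts stand in for
    the paper's local-density weighting, which is not reproduced here."""
    return Counter(words)


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = Counter()
    for vec in vectors:
        total.update(vec)
    return {t: w / len(vectors) for t, w in total.items()}


def sense_vectors(samples):
    """One centroid per sense, built from (sense, context_words) pairs."""
    by_sense = defaultdict(list)
    for sense, words in samples:
        by_sense[sense].append(context_vector(words))
    return {sense: centroid(vecs) for sense, vecs in by_sense.items()}


def retrieve_top_n(training, query, top_n):
    """Treat training samples as documents and the test sample as a query."""
    ranked = sorted(
        training,
        key=lambda sample: cosine(context_vector(sample[1]), query),
        reverse=True,
    )
    return ranked[:top_n]


def disambiguate(training, test_words, top_n=3):
    """Pick the sense with the highest combined score.
    Assumption: static and dynamic similarities are simply added."""
    query = context_vector(test_words)
    static = sense_vectors(training)                 # all training samples
    dynamic = sense_vectors(retrieve_top_n(training, query, top_n))
    scores = {
        sense: cosine(query, vec) + cosine(query, dynamic.get(sense, {}))
        for sense, vec in static.items()
    }
    return max(scores, key=scores.get)


# Hypothetical toy data for an ambiguous target word such as "bank".
TOY_TRAINING = [
    ("finance", ["money", "deposit", "loan"]),
    ("finance", ["account", "money", "interest"]),
    ("river", ["water", "shore", "fish"]),
    ("river", ["river", "water", "boat"]),
]

if __name__ == "__main__":
    print(disambiguate(TOY_TRAINING, ["deposit", "money", "account"]))
```

Because the dynamic vectors are built only from the retrieved samples, a sense with no retrieved sample contributes only its static similarity. This is one simple way to let query-specific evidence sharpen the decision.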
