Text, Speech and Dialogue

Two kinds of systems have been defined during the long history of word sense disambiguation (WSD): principled systems that define which knowledge types are useful for WSD, and robust systems that use the information sources at hand, such as dictionaries, light-weight ontologies or hand-tagged corpora. This paper tries to systematize the relation between desired knowledge types and actual information sources. We also compare the results for a wide range of algorithms that have been evaluated on a common test setting in our research group. We hope that this analysis will help shift the field from systems based on information sources to systems based on knowledge sources. This study might also shed some light on semi-automatic acquisition of desired knowledge types from existing resources.
