Co-STAR: A Co-training Style Algorithm for Hyponymy Relation Acquisition from Structured and Unstructured Text

This paper proposes a co-training style algorithm called Co-STAR that acquires hyponymy relations simultaneously from structured and unstructured text. In Co-STAR, two independent processes for hyponymy relation acquisition -- one handling structured text and the other handling unstructured text -- collaborate by repeatedly exchanging the knowledge they acquired about hyponymy relations. Unlike conventional co-training, the two processes in Co-STAR are applied to different source texts and training data. We show the effectiveness of this algorithm through experiments on large-scale hyponymy-relation acquisition from Japanese Wikipedia and Web texts. We also show that Co-STAR is robust against noisy training data.
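The exchange described in the abstract can be pictured with a toy sketch. This is not the authors' implementation. The `ToyLearner` class, the nearest-centroid scoring over a single numeric feature, the `min_conf` threshold, and the restriction to candidate pairs seen by both learners are all assumptions made for illustration. The sketch only shows the general shape of two learners, each with its own source and training data, passing confident labels to one another.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

Pair = Tuple[str, str]  # (hypernym, hyponym) candidate


@dataclass
class ToyLearner:
    """Hypothetical stand-in for one acquisition process (not the paper's classifier)."""
    features: Dict[Pair, float]  # one toy feature per candidate seen in this source
    labeled: Dict[Pair, bool] = field(default_factory=dict)  # this learner's own training data

    def predict(self, pair: Pair) -> Tuple[bool, float]:
        """Nearest-centroid guess; confidence is the gap between the two distances.

        Assumes the learner already holds at least one positive and one negative label.
        """
        pos = [self.features[p] for p, y in self.labeled.items() if y]
        neg = [self.features[p] for p, y in self.labeled.items() if not y]
        d_pos = abs(self.features[pair] - sum(pos) / len(pos))
        d_neg = abs(self.features[pair] - sum(neg) / len(neg))
        return d_pos < d_neg, abs(d_pos - d_neg)


def co_train(a: ToyLearner, b: ToyLearner, rounds: int = 3, min_conf: float = 0.4) -> None:
    """Repeatedly let each learner teach the other on candidates both have seen."""
    for _ in range(rounds):
        updates = []
        # Collect all updates first so both learners teach from this round's models.
        for teacher, student in ((a, b), (b, a)):
            for pair in teacher.features:
                if pair in student.features and pair not in student.labeled:
                    label, conf = teacher.predict(pair)
                    if conf >= min_conf:  # pass on confident decisions only
                        updates.append((student, pair, label))
        for student, pair, label in updates:
            student.labeled[pair] = label


# Toy data: `s` plays the structured-text learner, `u` the unstructured-text learner.
s = ToyLearner(
    features={("fruit", "apple"): 0.9, ("fruit", "car"): 0.1,
              ("fruit", "pear"): 0.8, ("fruit", "desk"): 0.2},
    labeled={("fruit", "apple"): True, ("fruit", "car"): False},
)
u = ToyLearner(
    features={("fruit", "kiwi"): 0.75, ("fruit", "rock"): 0.2,
              ("fruit", "pear"): 0.7, ("fruit", "desk"): 0.3},
    labeled={("fruit", "kiwi"): True, ("fruit", "rock"): False},
)
co_train(s, u)
```

In this toy run, `s` is confident about both shared candidates and passes them to `u`, while `u` is confident only about ("fruit", "pear"), so ("fruit", "desk") never reaches the training data of `s`. The confidence threshold is the knob that decides how much possibly noisy knowledge crosses between the two learners.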

Jong-Hoon Oh | Kentaro Torisawa | Stijn De Saeger | Ichiro Yamada

[1] Partha Pratim Talukdar, et al. Weakly-Supervised Acquisition of Labeled Class Instances using Graph Random Walks, 2008, EMNLP.

[2] Marti A. Hearst. Automatic Acquisition of Hyponyms from Large Text Corpora, 1992, COLING.

[3] Avrim Blum, et al. Combining Labeled and Unlabeled Data with Co-Training, 1998, COLT.

[4] Haixun Wang, et al. Towards a Probabilistic Taxonomy of Many Concepts, 2011.

[5] Vladimir N. Vapnik, et al. The Nature of Statistical Learning Theory, 2000, Statistics for Engineering and Information Science.

[6] Daniel Jurafsky, et al. Semantic Taxonomy Induction from Heterogenous Evidence, 2006, ACL.

[7] Kentaro Torisawa, et al. Hacking Wikipedia for Hyponymy Relation Acquisition, 2008, IJCNLP.

[8] Jong-Hoon Oh, et al. Bilingual Co-Training for Monolingual Hyponymy-Relation Acquisition, 2009, ACL/IJCNLP.

[9] Kentaro Torisawa, et al. Exploiting Wikipedia as External Knowledge for Named Entity Recognition, 2007, EMNLP.

[10] Oren Etzioni, et al. Open Information Extraction from the Web, 2007, CACM.

[11] Kentaro Torisawa, et al. Extracting Hyponyms of Prespecified Hypernyms from Itemizations and Headings in Web Documents, 2004, COLING.

[12] Jens Lehmann, et al. What Have Innsbruck and Leipzig in Common? Extracting Semantics from Wiki Content, 2007, ESWC.

[13] Sujith Ravi, et al. Using structured text for large-scale attribute extraction, 2008, CIKM.

[14] Daisuke Kawahara, et al. TSUBAKI: An Open Search Engine Infrastructure for Developing New Information Access Methodology, 2008, IJCNLP.

[15] Patrick Pantel, et al. Automatically Labeling Semantic Classes, 2004, NAACL.

[16] Gerhard Weikum, et al. YAGO: A Core of Semantic Knowledge, 2007, WWW.

[17] Patrick Pantel, et al. Entity Extraction via Ensemble Semantics, 2009, EMNLP.

[18] Kentaro Torisawa, et al. Inducing Gazetteers for Named Entity Recognition by Large-Scale Clustering of Dependency Relations, 2008, ACL.

[19] Satoshi Sekine, et al. Automatic Extraction of Hyponyms from Japanese Newspapers Using Lexico-syntactic Patterns, 2004, LREC.

[20] Masaki Murata, et al. Large Scale Relation Acquisition Using Class Dependent Patterns, 2009, ICDM.

[21] Eric Crestan, et al. Web-Scale Distributional Similarity and Entity Set Expansion, 2009, EMNLP.

[22] Benjamin Van Durme, et al. Finding Cars, Goddesses and Enzymes: Parametrizable Acquisition of Labeled Instances for Open-Domain Information Extraction, 2008, AAAI.

[23] Zhi-Hua Zhou, et al. Analyzing Co-training Style Algorithms, 2007, ECML.