Improving Event Causality Recognition with Multiple Background Knowledge Sources Using Multi-Column Convolutional Neural Networks

We propose a method for recognizing such event causalities as “smoke cigarettes” → “die of lung cancer” using background knowledge taken from web texts as well as original sentences from which candidates for the causalities were extracted. We retrieve texts related to our event causality candidates from four billion web pages by three distinct methods, including a why-question answering system, and feed them to our multi-column convolutional neural networks. This allows us to identify the useful background knowledge scattered in web texts and effectively exploit the identified knowledge to recognize event causalities. We empirically show that the combination of our neural network architecture and background knowledge significantly improves average precision, while the previous state-of-the-art method gains just a small benefit from such background knowledge.
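The architecture described above can be pictured with a minimal sketch, assuming a forward pass in which each text source (the original sentence and each kind of retrieved background text) goes to its own convolutional column, and the pooled column outputs are concatenated for a single causality score. Everything named here is hypothetical and chosen only for illustration: the function names `conv_max_pool`, `init_params`, and `mccnn_score`, the use of four columns, the dimensions, and the random vectors standing in for word embeddings. The abstract does not specify the actual column layout, filter sizes, or training procedure.

```python
import numpy as np

def conv_max_pool(x, filters, bias):
    """One column: 1D convolution over token windows, ReLU, max-over-time pooling.

    x:       (seq_len, emb_dim) embedded tokens of one text source
    filters: (n_filters, width, emb_dim)
    bias:    (n_filters,)
    """
    n_filters, width, emb_dim = filters.shape
    if x.shape[0] < width:
        # Zero-pad short inputs so at least one window exists.
        x = np.vstack([x, np.zeros((width - x.shape[0], emb_dim))])
    n_windows = x.shape[0] - width + 1
    windows = np.stack([x[i:i + width] for i in range(n_windows)])
    # Contract the (width, emb_dim) axes: result is (n_windows, n_filters).
    feats = np.tensordot(windows, filters, axes=([1, 2], [1, 2])) + bias
    feats = np.maximum(feats, 0.0)
    return feats.max(axis=0)

def init_params(n_columns, emb_dim, n_filters, width, rng):
    """Random parameters; each column has its own filters (an assumption)."""
    columns = [
        {
            "filters": rng.normal(scale=0.1, size=(n_filters, width, emb_dim)),
            "bias": np.zeros(n_filters),
        }
        for _ in range(n_columns)
    ]
    return {
        "columns": columns,
        "w_out": rng.normal(scale=0.1, size=n_columns * n_filters),
        "b_out": 0.0,
    }

def mccnn_score(column_inputs, params):
    """Concatenate pooled column features and apply a logistic output unit."""
    pooled = [
        conv_max_pool(x, p["filters"], p["bias"])
        for x, p in zip(column_inputs, params["columns"])
    ]
    hidden = np.concatenate(pooled)
    logit = hidden @ params["w_out"] + params["b_out"]
    return 1.0 / (1.0 + np.exp(-logit))

rng = np.random.default_rng(0)
emb_dim, n_filters, width = 8, 6, 3
# Hypothetical inputs: the original sentence plus three retrieved text sources,
# with random vectors in place of real word embeddings.
column_inputs = [rng.normal(size=(length, emb_dim)) for length in (5, 12, 9, 7)]
params = init_params(len(column_inputs), emb_dim, n_filters, width, rng)
score = mccnn_score(column_inputs, params)
print(f"causality score (untrained): {score:.3f}")
```

Because the parameters are untrained, the printed score is meaningless; the sketch only shows how several text sources of different lengths can be reduced to fixed-size vectors and combined into one prediction.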
