Exploiting Background Knowledge in Compact Answer Generation for Why-Questions

This paper proposes a novel method for generating compact answers to open-domain why-questions, such as the answer "Because deep learning technologies were introduced" to the question "Why did Google's machine translation service improve so drastically?" Although many studies have dealt with why-question answering, most have focused on retrieving relatively long text passages, consisting of several sentences, as answers. Because of their length, such passages are ill-suited to being read aloud by spoken dialog systems and smart speakers; hence, a method that generates compact answers is needed. We developed a novel neural summarizer for this compact answer generation task. It combines a recurrent neural network-based encoder-decoder model with stacked convolutional neural networks and is designed to effectively exploit background knowledge, in this case a set of causal relations (e.g., "[Microsoft's machine translation has made great progress over the last few years]_effect since [it started to use deep learning]_cause") extracted from a large web data archive (4 billion web pages). Our experimental results show that our method achieved significantly better ROUGE F-scores than existing encoder-decoder models and their variants augmented with query attention and memory networks, which were used to exploit the background knowledge.
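
To make the described architecture concrete, the sketch below shows one hedged, hypothetical way such a model could be wired together in PyTorch: a GRU-based encoder-decoder attends both over the question/passage and over retrieved causal-relation text encoded by stacked 1-D convolutions. Every class name, layer size, and the dot-product fusion scheme here is an assumption made for illustration; this is not the paper's actual network, nor does it cover how causal relations are retrieved and filtered.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class BackgroundAwareSeq2Seq(nn.Module):
    """Illustrative sketch: RNN encoder-decoder that also attends over
    background causal relations encoded by stacked CNNs."""

    def __init__(self, vocab_size, emb_dim=128, hid_dim=256, n_conv_layers=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Bidirectional GRU over the question and its retrieved passage.
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True, bidirectional=True)
        self.src_proj = nn.Linear(2 * hid_dim, hid_dim)
        # Stacked 1-D convolutions over the background causal relations.
        self.convs = nn.ModuleList(
            [nn.Conv1d(emb_dim if i == 0 else hid_dim, hid_dim,
                       kernel_size=3, padding=1) for i in range(n_conv_layers)]
        )
        # Decoder consumes the previous word plus one context vector per source.
        self.decoder = nn.GRUCell(emb_dim + 2 * hid_dim, hid_dim)
        self.out = nn.Linear(hid_dim, vocab_size)

    @staticmethod
    def attend(query, keys):
        # Dot-product attention: query (B, H), keys (B, T, H) -> context (B, H).
        scores = torch.bmm(keys, query.unsqueeze(2)).squeeze(2)
        weights = F.softmax(scores, dim=1)
        return torch.bmm(weights.unsqueeze(1), keys).squeeze(1)

    def forward(self, src_ids, bg_ids, tgt_ids):
        src_states, _ = self.encoder(self.embed(src_ids))   # (B, Ts, 2H)
        src_keys = self.src_proj(src_states)                 # (B, Ts, H)

        bg = self.embed(bg_ids).transpose(1, 2)              # (B, E, Tb)
        for conv in self.convs:
            bg = F.relu(conv(bg))
        bg_keys = bg.transpose(1, 2)                          # (B, Tb, H)

        h = torch.tanh(src_keys.mean(dim=1))                  # initial decoder state
        logits = []
        for t in range(tgt_ids.size(1)):                      # teacher forcing
            emb = self.embed(tgt_ids[:, t])
            ctx = torch.cat([self.attend(h, src_keys),
                             self.attend(h, bg_keys)], dim=1)
            h = self.decoder(torch.cat([emb, ctx], dim=1), h)
            logits.append(self.out(h))
        return torch.stack(logits, dim=1)                     # (B, Tt, V)


if __name__ == "__main__":
    model = BackgroundAwareSeq2Seq(vocab_size=1000)
    src = torch.randint(0, 1000, (2, 20))   # question + passage tokens
    bg = torch.randint(0, 1000, (2, 30))    # retrieved causal-relation tokens
    tgt = torch.randint(0, 1000, (2, 8))    # compact answer tokens (teacher forcing)
    print(model(src, bg, tgt).shape)        # torch.Size([2, 8, 1000])
```

The point this sketch is meant to illustrate is only the general design choice: the background knowledge gets its own (convolutional) encoder whose outputs are attended to at every decoding step, so cause phrases from web-extracted causal relations can directly influence the generated compact answer.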
