Bi-directional Gated Memory Networks for Answer Selection

Answer selection is a crucial subtask of open-domain question answering. In this paper, we introduce the Bi-directional Gated Memory Network (BGMN) to model the interactions between a question and a candidate answer. We match the question \(P\) and the answer \(Q\) in two directions. In each direction (for example, \(P \rightarrow Q\)), the sentence representation of \(P\) triggers an iterative attention process that aggregates informative evidence from \(Q\). In each iteration, the sentence representation of \(P\) and the evidence of \(Q\) aggregated so far are passed through a gate that determines their relative importance when attending to each step of \(Q\). Finally, based on the aggregated evidence, the decision is made by a fully connected network. Experimental results on the SemEval-2015 Task 3 dataset demonstrate that our proposed method substantially outperforms several strong baselines. Further experiments show that our model is general and can be applied to other sentence-pair modeling tasks.
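To make the gated attention step concrete, the following is a minimal PyTorch sketch of one iteration in the \(P \rightarrow Q\) direction, based only on the description above. The sigmoid gate, the additive attention scorer, and all dimensions are assumptions made for illustration; the paper's actual equations may differ.

```python
# Hypothetical sketch of one gated attention iteration (P -> Q direction).
# Gate formulation, scoring function, and dimensions are assumptions,
# not the authors' published equations.
import torch
import torch.nn as nn


class GatedAttentionStep(nn.Module):
    def __init__(self, hidden_dim: int):
        super().__init__()
        # Gate weighing the sentence representation of P against the
        # evidence of Q aggregated so far (assumed sigmoid gate).
        self.gate = nn.Linear(2 * hidden_dim, hidden_dim)
        # Additive attention scorer over the steps (token states) of Q.
        self.attn = nn.Linear(2 * hidden_dim, 1)

    def forward(self, p_repr, q_states, evidence):
        # p_repr:   (batch, hidden)        sentence representation of P
        # q_states: (batch, q_len, hidden) per-step hidden states of Q
        # evidence: (batch, hidden)        evidence of Q aggregated so far
        g = torch.sigmoid(self.gate(torch.cat([p_repr, evidence], dim=-1)))
        query = g * p_repr + (1.0 - g) * evidence  # gated query vector
        # Score every step of Q against the gated query.
        q_len = q_states.size(1)
        expanded = query.unsqueeze(1).expand(-1, q_len, -1)
        scores = self.attn(torch.cat([q_states, expanded], dim=-1)).squeeze(-1)
        weights = torch.softmax(scores, dim=-1)  # attention over steps of Q
        attended = torch.bmm(weights.unsqueeze(1), q_states).squeeze(1)
        return evidence + attended  # updated aggregated evidence
```

Running this step for several iterations, and symmetrically in the \(Q \rightarrow P\) direction, would produce the aggregated evidence that the final fully connected network consumes.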
