Learning Distributed Representations of Symbolic Structure Using Binding and Unbinding Operations

Widely used recurrent units, including the Long Short-Term Memory (LSTM) and the Gated Recurrent Unit (GRU), perform well on natural language tasks, but their ability to learn structured representations remains questionable. Exploiting Tensor Product Representations (TPRs), distributed representations of symbolic structure in which vector-embedded symbols are bound to vector-embedded structural positions, we propose the TPRU, a recurrent unit that, at each time step, explicitly executes structural-role binding and unbinding operations to incorporate structural information into learning. Experiments on both the Logical Entailment task and the Multi-genre Natural Language Inference (MNLI) task show that our TPR-derived recurrent unit delivers strong performance with significantly fewer parameters than LSTM and GRU baselines. Furthermore, a TPRU trained on MNLI generalises well to downstream tasks.
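To make the binding and unbinding primitives concrete, the sketch below implements Smolensky-style TPR binding (a sum of filler-role outer products) and unbinding (contraction with a dual role vector) in NumPy. The dimensions, the orthonormal-role assumption, and all variable names are illustrative choices for exposition, not the TPRU's actual parameterisation.

import numpy as np

# Smolensky-style TPR: bind fillers (symbols) to roles (structural
# positions) via outer products, then recover a filler by contracting
# the resulting tensor with the corresponding role vector.
# Sizes and the orthonormal-role assumption are illustrative only.
rng = np.random.default_rng(0)
n_roles, d_filler, d_role = 3, 4, 3

fillers = rng.normal(size=(n_roles, d_filler))   # f_i: symbol embeddings
roles, _ = np.linalg.qr(rng.normal(size=(d_role, n_roles)))
roles = roles.T                                  # r_i: orthonormal role vectors

# Binding: T = sum_i f_i ⊗ r_i
T = sum(np.outer(fillers[i], roles[i]) for i in range(n_roles))

# Unbinding: with orthonormal roles, the dual of r_j is r_j itself,
# so contracting T with r_j recovers f_j exactly.
recovered = T @ roles[1]
assert np.allclose(recovered, fillers[1])

With orthonormal roles, unbinding is exact; the TPRU described in the abstract instead performs learned versions of these binding and unbinding operations at each step of a recurrent cell.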
