Tracking the World State with Recurrent Entity Networks

We introduce a new model, the Recurrent Entity Network (EntNet). It is equipped with a dynamic long-term memory which allows it to maintain and update a representation of the state of the world as it receives new data. For language understanding tasks, it can reason on-the-fly as it reads text, not just when it is required to answer a question or respond as is the case for a Memory Network (Sukhbaatar et al., 2015). Like a Neural Turing Machine or Differentiable Neural Computer (Graves et al., 2014; 2016) it maintains a fixed size memory and can learn to perform location and content-based read and write operations. However, unlike those models it has a simple parallel architecture in which several memory locations can be updated simultaneously. The EntNet sets a new state-of-the-art on the bAbI tasks, and is the first method to solve all the tasks in the 10k training examples setting. We also demonstrate that it can solve a reasoning task which requires a large number of supporting facts, which other methods are not able to solve, and can generalize past its training horizon. It can also be practically used on large scale datasets such as Children's Book Test, where it obtains competitive performance, reading the story in a single pass.

[1]  Yoshua Bengio,et al.  Dynamic Neural Turing Machine with Soft and Hard Addressing Schemes , 2016, ArXiv.

[2]  Richard Socher,et al.  Dynamic Memory Networks for Visual and Textual Question Answering , 2016, ICML.

[3]  Sergio Gomez Colmenarejo,et al.  Hybrid computing using a neural network with dynamic external memory , 2016, Nature.

[4]  Jason Weston,et al.  Key-Value Memory Networks for Directly Reading Documents , 2016, EMNLP.

[5]  Joshua B. Tenenbaum,et al.  A Compositional Object-Based Approach to Learning Physical Dynamics , 2016, ICLR.

[6]  Jitendra Malik,et al.  Learning Visual Predictive Models of Physics for Playing Billiards , 2015, ICLR.

[7]  Clément Farabet,et al.  Torch7: A Matlab-like Environment for Machine Learning , 2011, NIPS 2011.

[8]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[9]  Ruslan Salakhutdinov,et al.  Gated-Attention Readers for Text Comprehension , 2016, ACL.

[10]  Jason Weston,et al.  The Goldilocks Principle: Reading Children's Books with Explicit Memory Representations , 2015, ICLR.

[11]  Ting Liu,et al.  Attention-over-Attention Neural Networks for Reading Comprehension , 2016, ACL.

[12]  Phil Blunsom,et al.  Learning to Transduce with Unbounded Memory , 2015, NIPS.

[13]  Rudolf Kadlec,et al.  Text Understanding with the Attention Sum Reader Network , 2016, ACL.

[14]  Jason Weston,et al.  Dialog-based Language Learning , 2016, NIPS.

[15]  Rob Fergus,et al.  Learning Physical Intuition of Block Towers by Example , 2016, ICML.

[16]  Rob Fergus,et al.  Learning Multiagent Communication with Backpropagation , 2016, NIPS.

[17]  Jason Weston,et al.  End-To-End Memory Networks , 2015, NIPS.

[18]  Philip Bachman,et al.  Natural Language Comprehension with the EpiReader , 2016, EMNLP.

[19]  Yann LeCun,et al.  Deep multi-scale video prediction beyond mean square error , 2015, ICLR.

[20]  Tomas Mikolov,et al.  Inferring Algorithmic Patterns with Stack-Augmented Recurrent Nets , 2015, NIPS.

[21]  Yoshua Bengio,et al.  On the Properties of Neural Machine Translation: Encoder–Decoder Approaches , 2014, SSST@EMNLP.

[22]  Hong Yu,et al.  Reasoning with Memory Augmented Neural Networks for Language Comprehension , 2017, ICLR.

[23]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[24]  Pascal Vincent,et al.  Hierarchical Memory Networks , 2016, ArXiv.

[25]  Xinyun Chen Under Review as a Conference Paper at Iclr 2017 Delving into Transferable Adversarial Ex- Amples and Black-box Attacks , 2016 .

[26]  Jian Sun,et al.  Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[27]  Richard S. Zemel,et al.  Gated Graph Sequence Neural Networks , 2015, ICLR.

[28]  Omer Levy,et al.  Published as a conference paper at ICLR 2018 S IMULATING A CTION D YNAMICS WITH N EURAL P ROCESS N ETWORKS , 2018 .

[29]  Jason Weston,et al.  Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks , 2015, ICLR.

[30]  Razvan Pascanu,et al.  Interaction Networks for Learning about Objects, Relations and Physics , 2016, NIPS.

[31]  Jason Weston,et al.  Memory Networks , 2014, ICLR.