Improving Generalization in Meta Reinforcement Learning using Neural Objectives
[1] Marcin Andrychowicz, et al. Learning to learn by gradient descent by gradient descent, 2016, NIPS.
[2] Jieyu Zhao, et al. Direct Policy Search and Uncertain Policy Evaluation, 1998.
[3] Jürgen Schmidhuber, et al. Recurrent World Models Facilitate Policy Evolution, 2018, NeurIPS.
[4] Jitendra Malik, et al. Learning to Optimize, 2016, ICLR.
[5] Jürgen Schmidhuber, et al. Learning to Forget: Continual Prediction with LSTM, 2000, Neural Computation.
[6] Alex Graves, et al. Playing Atari with Deep Reinforcement Learning, 2013, arXiv.
[7] Razvan Pascanu, et al. Policy Distillation, 2015, ICLR.
[8] Tom Schaul, et al. Natural Evolution Strategies, 2008, IEEE Congress on Evolutionary Computation.
[9] Sergey Levine, et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, 2017, ICML.
[10] Jürgen Schmidhuber, et al. A 'Self-Referential' Weight Matrix, 1993.
[11] Filip De Turck, et al. VIME: Variational Information Maximizing Exploration, 2016, NIPS.
[12] Jascha Sohl-Dickstein, et al. Learning Unsupervised Learning Rules, 2018, arXiv.
[13] Marcin Andrychowicz, et al. Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research, 2018, arXiv.
[14] Ruslan Salakhutdinov, et al. Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning, 2015, ICLR.
[15] Pieter Abbeel, et al. A Simple Neural Attentive Meta-Learner, 2017, ICLR.
[16] Jeff Clune, et al. AI-GAs: AI-generating algorithms, an alternate paradigm for producing general artificial intelligence, 2019, arXiv.
[17] Sergey Levine, et al. Meta-Learning and Universality: Deep Representations and Gradient Descent can Approximate any Learning Algorithm, 2017, ICLR.
[18] Guy Lever, et al. Deterministic Policy Gradient Algorithms, 2014, ICML.
[19] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[20] Razvan Pascanu, et al. Meta-Learning with Latent Embedding Optimization, 2018, ICLR.
[21] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[22] Sepp Hochreiter, et al. Learning to Learn Using Gradient Descent, 2001, ICANN.
[23] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[24] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[25] Pieter Abbeel, et al. Evolved Policy Gradients, 2018, NeurIPS.
[26] Alexei A. Efros, et al. Curiosity-Driven Exploration by Self-Supervised Prediction, 2017, IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[27] Daan Wierstra, et al. Stochastic Backpropagation and Approximate Inference in Deep Generative Models, 2014, ICML.
[28] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[29] Herke van Hoof, et al. Addressing Function Approximation Error in Actor-Critic Methods, 2018, ICML.
[30] Jitendra Malik, et al. Learning to Optimize Neural Nets, 2017, arXiv.
[31] Yevgen Chebotar, et al. Meta Learning via Learned Loss, 2019, International Conference on Pattern Recognition (ICPR).
[32] R. J. Williams, et al. On the use of backpropagation in associative reinforcement learning, 1988, IEEE International Conference on Neural Networks.
[33] Razvan Pascanu, et al. Imagination-Augmented Agents for Deep Reinforcement Learning, 2017, NIPS.
[34] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, arXiv.
[35] Michael I. Jordan, et al. RLlib: Abstractions for Distributed Reinforcement Learning, 2017, ICML.
[36] Ronald J. Williams. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 1992, Machine Learning.
[37] David Silver, et al. Meta-Gradient Reinforcement Learning, 2018, NeurIPS.
[38] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.
[39] Zeb Kurth-Nelson, et al. Learning to reinforcement learn, 2016, CogSci.
[40] Jürgen Schmidhuber, et al. A possibility for implementing curiosity and boredom in model-building neural controllers, 1991.
[41] John Schulman, et al. Gotta Learn Fast: A New Benchmark for Generalization in RL, 2018, arXiv.
[42] Yoshua Bengio, et al. Bayesian Model-Agnostic Meta-Learning, 2018, NeurIPS.