Martin A. Riedmiller | Sumanth Dathathri | Timothy A. Mann | Nicolas Heess | Cosmin Paduraru | Rae Jeong | Daniel J. Mankowitz | Dan A. Calian
[1] H. Francis Song,et al. A Distributional View on Multi-Objective Policy Optimization , 2020, ICML.
[2] Shie Mannor,et al. Exploration-Exploitation in Constrained MDPs , 2020, ArXiv.
[3] OpenAI. Learning Dexterous In-Hand Manipulation , 2018.
[4] Nir Levine,et al. An empirical investigation of the challenges of real-world reinforcement learning , 2020, ArXiv.
[5] Pieter Abbeel,et al. Mutual Alignment Transfer Learning , 2017, CoRL.
[6] Marcin Andrychowicz,et al. Sim-to-Real Transfer of Robotic Control with Dynamics Randomization , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[7] Marc G. Bellemare,et al. A Distributional Perspective on Reinforcement Learning , 2017, ICML.
[8] Demis Hassabis,et al. Mastering the game of Go without human knowledge , 2017, Nature.
[9] Raia Hadsell,et al. Value constrained model-free continuous control , 2019, ArXiv.
[10] Matthew W. Hoffman,et al. Distributed Distributional Deterministic Policy Gradients , 2018, ICLR.
[11] Shie Mannor,et al. Scaling Up Robust MDPs using Function Approximation , 2014, ICML.
[12] Shie Mannor,et al. A Bayesian Approach to Robust Reinforcement Learning , 2019, UAI.
[13] Yuval Tassa,et al. DeepMind Control Suite , 2018, ArXiv.
[14] Shie Mannor,et al. Soft-Robust Actor-Critic Policy-Gradient , 2018, UAI.
[15] Wojciech Zaremba,et al. Transfer from Simulation to Real World through Learning Deep Inverse Dynamics Model , 2016, ArXiv.
[16] Martin A. Riedmiller,et al. Robust Reinforcement Learning for Continuous Control with Model Misspecification , 2019, ICLR.
[17] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[18] Shie Mannor,et al. A Deep Hierarchical Approach to Lifelong Learning in Minecraft , 2016, AAAI.
[19] E. Altman. Constrained Markov Decision Processes , 1999.
[20] Gabriel Dulac-Arnold,et al. Challenges of Real-World Reinforcement Learning , 2019, ArXiv.
[21] Ramana Kumar,et al. Scalable Neural Learning for Verifiable Consistency with Temporal Specifications , 2019.
[22] Jonathan P. How,et al. Certified Adversarial Robustness for Deep Reinforcement Learning , 2019, CoRL.
[23] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[24] Shie Mannor,et al. Reward Constrained Policy Optimization , 2018, ICLR.
[25] Garud Iyengar,et al. Robust Dynamic Programming , 2005, Math. Oper. Res.
[26] Yuval Tassa,et al. Relative Entropy Regularized Policy Iteration , 2018, ArXiv.
[27] Junhyuk Oh,et al. Balancing Constraints and Rewards with Meta-Gradient D4PG , 2020, ICLR.
[28] Shie Mannor,et al. Policy Gradients with Variance Related Risk Criteria , 2012, ICML.
[29] Divyam Rastogi,et al. Sample-efficient Reinforcement Learning via Difference Models , 2018.
[30] Shie Mannor,et al. Learning Robust Options , 2018, AAAI.
[31] Pieter Abbeel,et al. Constrained Policy Optimization , 2017, ICML.
[32] Jakub W. Pachocki,et al. Learning dexterous in-hand manipulation , 2018, Int. J. Robotics Res..
[33] Yuval Tassa,et al. Maximum a Posteriori Policy Optimisation , 2018, ICLR.