Jan Peters | Joni Pajarinen | Pascal Klink | Stephan Weigand
[1] Wojciech M. Czarnecki, et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning, 2019, Nature.
[2] Bram Bakker, et al. Reinforcement Learning with Long Short-Term Memory, 2001, NIPS.
[3] Peter Stone, et al. Reinforcement learning, 2019, Scholarpedia.
[4] Sergey Levine, et al. Guided Policy Search, 2013, ICML.
[5] Jürgen Schmidhuber, et al. Policy Gradient Critics, 2007, ECML.
[6] Nolan Wagener, et al. Learning contact-rich manipulation skills with guided policy search, 2015, 2015 IEEE International Conference on Robotics and Automation (ICRA).
[7] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.
[8] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[9] Peter Stone, et al. Deep Recurrent Q-Learning for Partially Observable MDPs, 2015, AAAI Fall Symposia.
[10] Marcin Andrychowicz, et al. Asymmetric Actor Critic for Image-Based Robot Learning, 2017, Robotics: Science and Systems.
[11] Demis Hassabis, et al. Mastering the game of Go without human knowledge, 2017, Nature.
[12] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[13] Gerhard Neumann, et al. Guided Deep Reinforcement Learning for Swarm Systems, 2017, ArXiv.
[14] Sergey Levine, et al. Learning hand-eye coordination for robotic grasping with deep learning and large-scale data collection, 2016, Int. J. Robotics Res.
[15] Sergey Levine, et al. Learning Complex Neural Network Policies with Trajectory Optimization, 2014, ICML.
[16] Joni Pajarinen, et al. Robotic manipulation of multiple objects as a POMDP, 2014, Artif. Intell.
[17] Wojciech Zaremba, et al. OpenAI Gym, 2016, ArXiv.
[18] David Silver, et al. Memory-based control with recurrent neural networks, 2015, ArXiv.
[19] Sergey Levine, et al. Learning deep neural network policies with continuous memory states, 2015, 2016 IEEE International Conference on Robotics and Automation (ICRA).
[20] Shimon Whiteson, et al. Deep Variational Reinforcement Learning for POMDPs, 2018, ICML.
[21] Guy Shani, et al. A survey of point-based POMDP solvers, 2013, Autonomous Agents and Multi-Agent Systems.
[22] Leslie Pack Kaelbling, et al. Planning and Acting in Partially Observable Stochastic Domains, 1998, Artif. Intell.
[23] Jürgen Schmidhuber, et al. Solving Deep Memory POMDPs with Recurrent Policy Gradients, 2007, ICANN.
[24] Wolfram Burgard, et al. VLocNet++: Deep Multitask Learning for Semantic Visual Localization and Odometry, 2018, IEEE Robotics and Automation Letters.
[25] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[26] Sergey Levine, et al. End-to-End Training of Deep Visuomotor Policies, 2015, J. Mach. Learn. Res.
[27] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[28] Sergey Levine, et al. Guided Policy Search via Approximate Mirror Descent, 2016, NIPS.
[29] Sergey Levine, et al. Learning deep control policies for autonomous aerial vehicles with MPC-guided policy search, 2015, 2016 IEEE International Conference on Robotics and Automation (ICRA).
[30] Reid G. Simmons, et al. Heuristic Search Value Iteration for POMDPs, 2004, UAI.