Combining Model-Based and Model-Free Updates for Trajectory-Centric Reinforcement Learning
Gaurav S. Sukhatme | Sergey Levine | Stefan Schaal | Yevgen Chebotar | Karol Hausman | Marvin Zhang
[1] Jan Peters et al., A Survey on Policy Search for Robotics, 2013, Found. Trends Robotics.
[2] Max Welling et al., Auto-Encoding Variational Bayes, 2013, ICLR.
[3] Jonas Buchli et al., Learning of closed-loop motion control, 2014, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[4] Alex Graves et al., Playing Atari with Deep Reinforcement Learning, 2013, arXiv.
[5] Stefan Schaal et al., Learning and generalization of motor skills by learning from demonstration, 2009, IEEE International Conference on Robotics and Automation.
[6] Nolan Wagener et al., Learning contact-rich manipulation skills with guided policy search, 2015, IEEE International Conference on Robotics and Automation (ICRA).
[7] Jun Nakanishi et al., Control, Planning, Learning, and Imitation with Dynamic Movement Primitives, 2003.
[8] Stefan Schaal et al., Reinforcement learning of motor skills with policy gradients, 2008, Neural Networks (Special Issue).
[9] Yunpeng Pan et al., Probabilistic Differential Dynamic Programming, 2014, NIPS.
[10] Stefan Schaal et al., A Generalized Path Integral Control Approach to Reinforcement Learning, 2010, J. Mach. Learn. Res.
[11] Sergey Levine et al., Guided Policy Search via Approximate Mirror Descent, 2016, NIPS.
[12] Jan Peters et al., Reinforcement learning in robotics: A survey, 2013, Int. J. Robotics Res.
[13] Carl E. Rasmussen et al., Learning to Control a Low-Cost Manipulator using Data-Efficient Reinforcement Learning, 2011, Robotics: Science and Systems.
[14] Sergey Levine et al., Reset-free guided policy search: Efficient deep reinforcement learning with stochastic initial states, 2017, IEEE International Conference on Robotics and Automation (ICRA).
[15] Sergey Levine et al., Path integral guided policy search, 2017, IEEE International Conference on Robotics and Automation (ICRA).
[16] Sergey Levine et al., High-Dimensional Continuous Control Using Generalized Advantage Estimation, 2015, ICLR.
[17] Richard S. Sutton et al., Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, 1990, ML.
[18] Sergey Levine et al., Learning Neural Network Policies with Guided Policy Search under Unknown Dynamics, 2014, NIPS.
[19] Carl E. Rasmussen et al., Gaussian Processes for Data-Efficient Learning in Robotics and Control, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[20] Oliver Kroemer et al., Learning sequential motor tasks, 2013, IEEE International Conference on Robotics and Automation.
[21] Sergey Levine et al., Trust Region Policy Optimization, 2015, ICML.
[22] Hany Abdulsamad et al., Model-Free Trajectory Optimization for Reinforcement Learning, 2016, ICML.
[23] Yasemin Altun et al., Relative Entropy Policy Search, 2010.
[24] Yuval Tassa et al., Synthesis and stabilization of complex behaviors through online trajectory optimization, 2012, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[25] Sergey Levine et al., End-to-End Training of Deep Visuomotor Policies, 2015, J. Mach. Learn. Res.
[26] Yuval Tassa et al., Continuous control with deep reinforcement learning, 2015, ICLR.
[27] Sergey Levine et al., Continuous Deep Q-Learning with Model-based Acceleration, 2016, ICML.
[28] Yuval Tassa et al., Learning Continuous Control Policies by Stochastic Value Gradients, 2015, NIPS.