Model-Free Trajectory-based Policy Optimization with Monotonic Improvement
Jan Peters | Hany Abdulsamad | Gerhard Neumann | Abbas Abdolmaleki | Riad Akrour
[1] Luís Paulo Reis, et al. Model-Based Relative Entropy Stochastic Search, 2016, NIPS.
[2] Luca Bascetta, et al. Adaptive Step-Size for Policy Gradient Methods, 2013, NIPS.
[3] Jan Peters, et al. Data-Efficient Generalization of Robot Skills with Contextual Policy Search, 2013, AAAI.
[4] Sergey Levine, et al. Learning Neural Network Policies with Guided Policy Search under Unknown Dynamics, 2014, NIPS.
[5] Marc Toussaint, et al. Robot trajectory optimization using approximate inference, 2009, ICML '09.
[6] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[7] Sergey Levine, et al. Learning Complex Neural Network Policies with Trajectory Optimization, 2014, ICML.
[8] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[9] Stefan Schaal, et al. Path integral-based stochastic optimal control for rigid body dynamics, 2009, 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning.
[10] Yuval Tassa, et al. Iterative local dynamic programming, 2009, 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning.
[11] Emanuel Todorov, et al. Optimal Control Theory, 2006.
[12] Anastasios Kyrillidis, et al. Dropping Convexity for Faster Semi-definite Optimization, 2015, COLT.
[13] Carl E. Rasmussen, et al. PILCO: A Model-Based and Data-Efficient Approach to Policy Search, 2011, ICML.
[14] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[15] Yuval Tassa, et al. Stochastic Differential Dynamic Programming, 2010, Proceedings of the 2010 American Control Conference.
[16] Csaba Szepesvári, et al. Algorithms for Reinforcement Learning, 2010, Synthesis Lectures on Artificial Intelligence and Machine Learning.
[17] Daniele Calandriello, et al. Safe Policy Iteration, 2013, ICML.
[18] Jan Peters, et al. Robust policy updates for stochastic optimal control, 2014, 2014 IEEE-RAS International Conference on Humanoid Robots.
[19] Hany Abdulsamad, et al. Model-Free Trajectory Optimization for Reinforcement Learning, 2016, ICML.
[20] Jan Peters, et al. A Survey on Policy Search for Robotics, 2013, Found. Trends Robotics.
[21] Sham M. Kakade, et al. On the sample complexity of reinforcement learning, 2003.
[22] Yunpeng Pan, et al. Probabilistic Differential Dynamic Programming, 2014, NIPS.
[23] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[24] Yasemin Altun, et al. Relative Entropy Policy Search, 2010.
[25] John Langford, et al. Approximately Optimal Approximate Reinforcement Learning, 2002, ICML.
[26] Jan Peters, et al. A biomimetic approach to robot table tennis, 2010, 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[27] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[28] Jan Peters, et al. Hierarchical Relative Entropy Policy Search, 2014, AISTATS.
[29] Jun Nakanishi, et al. Learning Attractor Landscapes for Learning Motor Primitives, 2002, NIPS.
[30] Paul Wagner, et al. A reinterpretation of the policy oscillation phenomenon in approximate policy iteration, 2011, NIPS.