Reinforcement learning of motor skills in high dimensions: A path integral approach
Stefan Schaal | Evangelos A. Theodorou | Jonas Buchli
[1] Richard S. Sutton, et al. Neuronlike adaptive elements that can solve difficult learning control problems, 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[2] B. Øksendal. Stochastic Differential Equations, 1985.
[3] B. Øksendal. Stochastic differential equations: an introduction with applications, 1987.
[4] Vijaykumar Gullapalli, et al. A stochastic reinforcement learning algorithm for learning real-valued functions, 1990, Neural Networks.
[5] M. James. Controlled Markov processes and viscosity solutions, 1994.
[6] Robert F. Stengel, et al. Optimal Control and Estimation, 1994.
[7] J. Yong. Relations among ODEs, PDEs, FSDEs, BSDEs, and FBSDEs, 1997, Proceedings of the 36th IEEE Conference on Decision and Control.
[8] Geoffrey E. Hinton, et al. Using Expectation-Maximization for Reinforcement Learning, 1997, Neural Computation.
[9] Christopher G. Atkeson, et al. Constructive Incremental Learning from Only Local Information, 1998, Neural Computation.
[10] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[11] Shun-ichi Amari, et al. Natural Gradient Learning for Over- and Under-Complete Bases in ICA, 1999, Neural Computation.
[12] Geoffrey E. Hinton, et al. Using EM for Reinforcement Learning, 2000.
[13] Peter L. Bartlett, et al. Infinite-Horizon Policy-Gradient Estimation, 2001, J. Artif. Intell. Res.
[14] Jun Nakanishi, et al. Learning Attractor Landscapes for Learning Motor Primitives, 2002, NIPS.
[15] Ronald J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 1992, Machine Learning.
[16] H. Kappen. Linear theory for control of nonlinear stochastic systems, 2004, Physical Review Letters.
[17] H. Kappen. Path integrals and symmetry breaking for optimal control theory, 2005, physics/0505066.
[18] Marc Toussaint, et al. Probabilistic inference for solving discrete and continuous state Markov Decision Processes, 2006, ICML.
[19] Emanuel Todorov, et al. Linearly-solvable Markov decision problems, 2006, NIPS.
[20] Stefan Schaal, et al. Reinforcement Learning for Parameterized Motor Primitives, 2006, The 2006 IEEE International Joint Conference on Neural Network Proceedings.
[21] Hilbert J. Kappen, et al. Stochastic Optimal Control in Continuous Space-Time Multi-Agent Systems, 2006, UAI.
[22] Stefan Schaal, et al. Policy Gradient Methods for Robotics, 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[23] Mohammad Ghavamzadeh, et al. Bayesian actor-critic algorithms, 2007, ICML '07.
[24] H. Kappen. An introduction to stochastic control theory, path integrals and reinforcement learning, 2007.
[25] Emanuel Todorov, et al. General duality between optimal control and estimation, 2008, 2008 47th IEEE Conference on Decision and Control.
[26] Stefan Schaal, et al. Natural Actor-Critic, 2003, Neurocomputing.
[27] Hilbert J. Kappen, et al. Graphical Model Inference in Optimal Control of Stochastic Multi-Agent Systems, 2008, J. Artif. Intell. Res.
[28] Christos Dimitrakakis, et al. Rollout sampling approximate policy iteration, 2008, Machine Learning.
[29] Jan Peters, et al. Machine Learning for motor skills in robotics, 2008, Künstliche Intell.
[30] Stefan Schaal, et al. Learning to Control in Operational Space, 2008, Int. J. Robotics Res.
[31] Stefan Schaal, et al. 2008 Special Issue: Reinforcement learning of motor skills with policy gradients, 2008.
[32] Jan Peters, et al. Learning motor primitives for robotics, 2009, 2009 IEEE International Conference on Robotics and Automation.
[33] Emanuel Todorov, et al. Efficient computation of optimal actions, 2009, Proceedings of the National Academy of Sciences.
[34] Marc Toussaint, et al. Learning model-free robot control by a Monte Carlo EM algorithm, 2009, Auton. Robots.
[35] Carl E. Rasmussen, et al. Gaussian process dynamic programming, 2009, Neurocomputing.
[36] Vicenç Gómez, et al. Optimal control as a graphical model inference problem, 2009, Machine Learning.
[37] H. Kappen. A path integral approach to agent planning, 2022.