Robot Learning with State-Dependent Exploration
Frank Sehnke | Jürgen Schmidhuber | Martin Felder | Thomas Rückstieß
[1] Peter L. Bartlett, et al. Reinforcement Learning in POMDP's via Direct Gradient Ascent, 2000, ICML.
[2] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[3] Frank Sehnke, et al. Policy Gradients with Parameter-Based Exploration for Control, 2008, ICANN.
[4] Jürgen Schmidhuber, et al. State-Dependent Exploration for Policy Gradient Methods, 2008, ECML/PKDD.
[5] Leonid Peshkin, et al. Reinforcement learning for adaptive routing, 2002, Proceedings of the 2002 International Joint Conference on Neural Networks (IJCNN'02).
[6] Peter Dayan, et al. Q-learning, 1992, Machine Learning.
[7] J. Spall. Implementation of the simultaneous perturbation algorithm for stochastic optimization, 1998.
[8] Marco Wiering, et al. Explorations in efficient reinforcement learning, 1999.
[9] Takayuki Kanda, et al. Robot behavior adaptation for human-robot interaction based on policy gradient reinforcement learning, 2005, 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[10] Michael I. Jordan, et al. PEGASUS: A policy search method for large MDPs and POMDPs, 2000, UAI.
[11] Ronald J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 1992, Machine Learning.
[12] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[13] Matthew Saffell, et al. Learning to trade via direct reinforcement, 2001, IEEE Trans. Neural Networks.
[14] Douglas Aberdeen, et al. Policy-Gradient Algorithms for Partially Observable Markov Decision Processes, 2003.
[15] Shin Ishii, et al. Reinforcement learning for a biped robot based on a CPG-actor-critic method, 2007, Neural Networks.
[16] Stefan Schaal, et al. Natural Actor-Critic, 2003, Neurocomputing.
[17] Stefan Schaal, et al. Policy Gradient Methods for Robotics, 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[18] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.