Multimodal Parameter-exploring Policy Gradients
暂无分享,去创建一个
[1] Frank Sehnke,et al. Policy Gradients with Parameter-Based Exploration for Control , 2008, ICANN.
[2] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[3] Peter L. Bartlett,et al. Reinforcement Learning in POMDP's via Direct Gradient Ascent , 2000, ICML.
[4] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.
[5] Judy A. Franklin,et al. Biped dynamic walking using reinforcement learning , 1997, Robotics Auton. Syst..
[6] Hans-Paul Schwefel,et al. Evolution and optimum seeking , 1995, Sixth-generation computer technology series.
[7] Stefan Schaal,et al. Natural Actor-Critic , 2003, Neurocomputing.
[8] Stefan Schaal,et al. Policy Gradient Methods for Robotics , 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[9] Douglas Aberdeen,et al. Policy-Gradient Algorithms for Partially Observable Markov Decision Processes , 2003 .
[10] Frank Sehnke,et al. Parameter-exploring policy gradients , 2010, Neural Networks.
[11] Tom Schaul,et al. Exploring parameter space in reinforcement learning , 2010, Paladyn J. Behav. Robotics.