Policy Learning - A Unified Perspective with Applications in Robotics
[1] Shin Ishii, et al. Reinforcement Learning for CPG-Driven Biped Robot, 2004, AAAI.
[2] Takayuki Kanda, et al. Robot behavior adaptation for human-robot interaction based on policy gradient reinforcement learning, 2005, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[3] Jin Yu, et al. Natural Actor-Critic for Road Traffic Optimisation, 2006, NIPS.
[4] Geoffrey E. Hinton, et al. Using Expectation-Maximization for Reinforcement Learning, 1997, Neural Computation.
[5] Florentin Wörgötter, et al. Fast Biped Walking with a Sensor-driven Neuronal Controller and Real-time Online Learning, 2006, Int. J. Robotics Res.
[6] Douglas Aberdeen, et al. Policy-Gradient Algorithms for Partially Observable Markov Decision Processes, 2003.
[7] H. Sebastian Seung, et al. Learning to Walk in 20 Minutes, 2005.
[8] Peter Stone, et al. Policy gradient reinforcement learning for fast quadrupedal locomotion, 2004, IEEE International Conference on Robotics and Automation (ICRA '04).
[9] Judy A. Franklin, et al. Biped dynamic walking using reinforcement learning, 1997, Robotics Auton. Syst.
[10] Sham M. Kakade, et al. A Natural Policy Gradient, 2001, NIPS.
[11] Shin Ishii, et al. Reinforcement Learning for Biped Locomotion, 2002, ICANN.
[12] Jun Morimoto, et al. Learning CPG Sensory Feedback with Policy Gradient for Biped Locomotion for a Full-Body Humanoid, 2005, AAAI.
[13] Stefan Schaal, et al. Learning Operational Space Control, 2006, Robotics: Science and Systems.
[14] Andrew Zisserman, et al. Advances in Neural Information Processing Systems (NIPS), 2007.
[15] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[16] Stefan Schaal, et al. Reinforcement Learning for Humanoid Robotics, 2003.
[17] V. Gullapalli, et al. Acquiring robot skills via reinforcement learning, 1994, IEEE Control Systems.
[18] Shigenobu Kobayashi, et al. Reinforcement learning for continuous action using stochastic gradient ascent, 1998.
[19] David C. Sterratt, et al. Does Morphology Influence Temporal Plasticity?, 2002, ICANN.
[20] Stefan Schaal, et al. Natural Actor-Critic, 2003, Neurocomputing.
[21] John N. Tsitsiklis, et al. Actor-Critic Algorithms, 1999, NIPS.
[22] Douglas Aberdeen, et al. POMDPs and Policy Gradients, 2006.
[23] Jan Peters, et al. Reinforcement Learning of Perceptual Coupling for Motor Primitives, 2008, EWRL 2008.
[24] Oliver G. Selfridge, et al. Real-time learning: a ball on a beam, 1992, IJCNN International Joint Conference on Neural Networks.