Enhancing the episodic natural actor-critic algorithm by a regularisation term to stabilize learning of control structures
Kurt Geihs | Martin A. Riedmiller | Roland Reichle | Sascha Lange | Andreas Witsch
[1] J. Shynk. Adaptive IIR filtering, 1989, IEEE ASSP Magazine.
[2] P. Regalia. Adaptive IIR Filtering in Signal Processing and Control, 1994.
[3] D. Harville. Matrix Algebra From a Statistician's Perspective, 1998.
[4] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[5] Herbert Jaeger, et al. The "echo state" approach to analysing and training recurrent neural networks, 2001.
[6] Ronald J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 2004, Machine Learning.
[7] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[8] Stefan Schaal, et al. Natural Actor-Critic, 2003, Neurocomputing.
[9] Christopher M. Bishop, et al. Pattern Recognition and Machine Learning (Information Science and Statistics), 2006.
[10] Nasser M. Nasrabadi, et al. Pattern Recognition and Machine Learning, 2006, Technometrics.
[11] Martin A. Riedmiller, et al. Evaluation of Policy Gradient Methods and Variants on the Cart-Pole Benchmark, 2007, 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning.
[12] Jan Peters, et al. Machine Learning for motor skills in robotics, 2008, Künstliche Intell.
[13] Stefan Schaal, et al. 2008 Special Issue: Reinforcement learning of motor skills with policy gradients, 2008.
[14] K. Geihs, et al. Carpe Noctem 2009, 2009.