PILCO: A Model-Based and Data-Efficient Approach to Policy Search
[1] Jeff G. Schneider, et al. Exploiting Model Uncertainty Estimates for Safe Dynamic Control Learning, 1996, NIPS.
[2] Christopher G. Atkeson, et al. A comparison of direct and model-based reinforcement learning, 1997, Proceedings of International Conference on Robotics and Automation.
[3] Visakan Kadirkamanathan, et al. Dual adaptive control of nonlinear stochastic systems using neural networks, 1998, Automatica.
[4] Shigenobu Kobayashi, et al. Efficient Non-Linear Control by Combining Q-learning with Local Linear Controllers, 1999, ICML.
[5] Yoav Naveh, et al. Nonlinear Modeling and Control of a Unicycle, 1999.
[6] Kenji Doya, et al. Reinforcement Learning in Continuous Time and Space, 2000, Neural Computation.
[7] Wei Zhong, et al. Energy and passivity based control of the double inverted pendulum on a cart, 2001, Proceedings of the 2001 IEEE International Conference on Control Applications (CCA'01).
[8] Jeff G. Schneider, et al. Autonomous helicopter control using reinforcement learning policy search methods, 2001, Proceedings 2001 IEEE International Conference on Robotics and Automation (ICRA).
[9] Rémi Coulom. Reinforcement Learning Using Neural Networks, with Applications to Motor Control, 2002.
[10] Agathe Girard, et al. Propagation of uncertainty in Bayesian kernel models - application to multiple-step ahead forecasting, 2003, Proceedings of the 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03).
[11] Carl E. Rasmussen, et al. Gaussian Processes in Reinforcement Learning, 2003, NIPS.
[12] Shie Mannor, et al. Bayes Meets Bellman: The Gaussian Process Approach to Temporal Difference Learning, 2003, ICML.
[13] A. Pacut, et al. Model-free off-policy reinforcement learning in continuous environment, 2004, 2004 IEEE International Joint Conference on Neural Networks.
[14] Christopher K. I. Williams, et al. Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning), 2005.
[15] Martin A. Riedmiller. Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method, 2005, ECML.
[16] Pieter Abbeel, et al. Using inaccurate models in reinforcement learning, 2006, ICML.
[17] Stefan Schaal, et al. Policy Gradient Methods for Robotics, 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[18] Nasser M. Nasrabadi, et al. Pattern Recognition and Machine Learning, 2006, Technometrics.
[19] Dieter Fox, et al. Gaussian Processes and Reinforcement Learning for Identification and Control of an Autonomous Blimp, 2007, Proceedings 2007 IEEE International Conference on Robotics and Automation.
[20] Carl E. Rasmussen, et al. Probabilistic Inference for Fast Learning in Control, 2008, EWRL.
[21] Uwe D. Hanebeck, et al. Analytic moment-based Gaussian process filtering, 2009, ICML '09.
[22] Carl E. Rasmussen, et al. Gaussian processes for machine learning, 2005, Adaptive Computation and Machine Learning.
[23] Tapani Raiko, et al. Variational Bayesian learning of nonlinear hidden state-space models for model predictive control, 2009, Neurocomputing.
[24] Carl E. Rasmussen, et al. Gaussian process dynamic programming, 2009, Neurocomputing.
[25] Marc Peter Deisenroth. Efficient reinforcement learning using Gaussian processes, 2010.
[26] Alan Fern, et al. Incorporating Domain Models into Bayesian Optimization for RL, 2010, ECML/PKDD.
[27] Carl E. Rasmussen, et al. Learning to Control a Low-Cost Manipulator using Data-Efficient Reinforcement Learning, 2011, Robotics: Science and Systems.
[28] Hado Philip van Hasselt. Insights in reinforcement learning: formal analysis and empirical evaluation of temporal-difference learning algorithms, 2011.