Paper Information - Probabilistic Inference for Fast Learning in Control


We provide a novel framework for very fast model-based reinforcement learning in continuous state and action spaces. The framework requires probabilistic models that explicitly characterize their levels of confidence. Within this framework, we use flexible, non-parametric models to describe the world based on previously collected experience. We demonstrate learning on the cart-pole problem in a setting where we provide very limited prior knowledge about the task. Learning progresses rapidly, and a good policy is found after only a handful of iterations.
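As an illustration of what "probabilistic models that explicitly characterize their levels of confidence" can mean in practice, the following minimal sketch fits a one-dimensional Gaussian process regressor and returns a predictive mean together with a predictive variance. It is an assumption that a Gaussian process is a suitable stand-in here: the abstract only says "flexible, non-parametric models", although several of the listed references concern Gaussian processes. All function names, the squared-exponential kernel, its hyperparameters, and the toy data are hypothetical choices for this example, not the authors' implementation.

```python
import math

def se_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between two scalar inputs (assumed kernel)."""
    return signal_var * math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Solve L x = b by forward substitution."""
    x = []
    for i in range(len(b)):
        x.append((b[i] - sum(L[i][k] * x[k] for k in range(i))) / L[i][i])
    return x

def solve_lower_transposed(L, b):
    """Solve L^T x = b by back substitution."""
    n = len(b)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(L[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (b[i] - s) / L[i][i]
    return x

def gp_predict(xs, ys, x_star, noise_var=1e-4):
    """Posterior mean and variance of a zero-mean GP at the test input x_star."""
    n = len(xs)
    # Training covariance with observation noise on the diagonal.
    K = [[se_kernel(xs[i], xs[j]) + (noise_var if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    # alpha = K^{-1} y, computed via two triangular solves.
    alpha = solve_lower_transposed(L, solve_lower(L, ys))
    k_star = [se_kernel(x, x_star) for x in xs]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    # Variance = prior variance minus the part explained by the data.
    v = solve_lower(L, k_star)
    var = se_kernel(x_star, x_star) - sum(vi * vi for vi in v)
    return mean, max(var, 0.0)

# Toy "experience": three observations of an unknown function (here sin).
xs = [-1.0, 0.0, 1.0]
ys = [math.sin(x) for x in xs]

mean_near, var_near = gp_predict(xs, ys, 0.1)  # close to observed data
mean_far, var_far = gp_predict(xs, ys, 5.0)    # far from observed data
print(f"near data: mean={mean_near:.3f}, variance={var_near:.4f}")
print(f"far away:  mean={mean_far:.3f}, variance={var_far:.4f}")
```

In this sketch the predictive variance is small near the observed inputs and reverts towards the prior variance far from them, so the model reports low confidence where it has no experience. A planner built on such a model could, in principle, take that uncertainty into account rather than trusting the mean prediction everywhere.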

Carl E. Rasmussen | Marc Peter Deisenroth

[1] Richard S. Sutton, et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, 1990, ML.

[2] Stefan Schaal, et al. Robot Learning From Demonstration, 1997, ICML.

[3] Christopher G. Atkeson, et al. A comparison of direct and model-based reinforcement learning, 1997, Proceedings of International Conference on Robotics and Automation.

[4] Thomas G. Dietterich. Adaptive computation and machine learning, 1998.

[5] Kenji Doya, et al. Reinforcement Learning in Continuous Time and Space, 2000, Neural Computation.

[6] C. Rasmussen, et al. Gaussian Process Priors with Uncertain Inputs - Application to Multiple-Step Ahead Time Series Forecasting, 2002, NIPS.

[7] Carl E. Rasmussen, et al. Gaussian Processes in Reinforcement Learning, 2003, NIPS.

[8] Christopher K. I. Williams, et al. Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning), 2005.

[9] Pieter Abbeel, et al. Exploration and apprenticeship learning in reinforcement learning, 2005, ICML.

[10] Zoubin Ghahramani, et al. Sparse Gaussian Processes using Pseudo-inputs, 2005, NIPS.

[11] Pieter Abbeel, et al. Using inaccurate models in reinforcement learning, 2006, ICML.

[12] Malte Kuß, et al. Gaussian process models for robust regression, classification, and reinforcement learning, 2006.

[13] Pascal Poupart, et al. Model-based Bayesian Reinforcement Learning in Partially Observable Domains, 2008, ISAIM.

[14] Stefan Schaal, et al. Learning to Control in Operational Space, 2008, Int. J. Robotics Res.

[15] Carl E. Rasmussen, et al. Gaussian processes for machine learning, 2005, Adaptive computation and machine learning.