Practical Reinforcement Learning in Continuous Spaces

Dynamic control tasks are good candidates for the application of reinforcement learning techniques. However, many of these tasks inherently have continuous state or action variables, which causes problems for traditional reinforcement learning algorithms that assume discrete state and action spaces. In this paper, we introduce an algorithm that safely approximates the value function for continuous-state control tasks and that learns quickly from a small amount of data. We give experimental results using this algorithm to learn policies both for a simulated task and for a real robot operating in an unaltered environment. The algorithm works well in a traditional learning setting, and learns substantially more effectively when bootstrapped with a small amount of human-provided data.
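To make the continuous-state difficulty concrete, the sketch below shows generic semi-gradient Q-learning with a linear approximator over radial-basis features on a toy one-dimensional task, in place of a discrete lookup table. This illustrates only the general setting the abstract describes; it is not the algorithm introduced in this paper, and the toy environment, feature construction, and hyperparameters are assumptions made purely for the example.

```python
# Generic semi-gradient Q-learning with linear function approximation over
# RBF features, on an assumed toy 1-D continuous-state task. Illustrative only.
import numpy as np

rng = np.random.default_rng(0)

N_ACTIONS = 2                              # action 0 = move left, 1 = move right
CENTERS = np.linspace(0.0, 1.0, 10)        # RBF centers spanning the state interval
WIDTH = 0.1

def features(s):
    """Radial-basis features for a scalar state in [0, 1]."""
    return np.exp(-0.5 * ((s - CENTERS) / WIDTH) ** 2)

def step(s, a):
    """One noisy transition; reward 1.0 only on reaching the right edge."""
    raw = s + (0.05 if a == 1 else -0.05) + rng.normal(0.0, 0.01)
    s_next = float(np.clip(raw, 0.0, 1.0))
    done = s_next >= 1.0
    return s_next, (1.0 if done else 0.0), done

# One weight vector per action: Q(s, a) = w[a] . features(s)
w = np.zeros((N_ACTIONS, len(CENTERS)))
alpha, gamma, epsilon = 0.1, 0.95, 0.2

for episode in range(300):
    s = float(rng.uniform(0.0, 1.0))
    for t in range(100):                   # cap episode length
        phi = features(s)
        q = w @ phi
        if rng.random() < epsilon:
            a = int(rng.integers(N_ACTIONS))
        else:
            a = int(rng.choice(np.flatnonzero(q == q.max())))  # random tie-break
        s_next, r, done = step(s, a)
        # TD target: greedy value at the next state, zero if terminal.
        target = r if done else r + gamma * float(np.max(w @ features(s_next)))
        w[a] += alpha * (target - q[a]) * phi   # semi-gradient update
        s = s_next
        if done:
            break

print("Approximate Q-values at s = 0.9:", w @ features(0.9))
```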