Learning to Drive a Bicycle Using Reinforcement Learning and Shaping

We present and solve a real-world problem of learning to drive a bicycle. We solve the problem by online reinforcement learning using the Sarsa(λ) algorithm. Then we solve the composite problem of learning to balance a bicycle and then drive to a goal. In our approach the reinforcement function is independent of the task the agent tries to learn to solve.
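
For readers unfamiliar with the algorithm named in the abstract, the following is a minimal sketch of tabular Sarsa(λ) with an epsilon-greedy policy. It is not the authors' bicycle simulator or their implementation: the one-dimensional chain task, the choice of replacing eligibility traces, the step cap, and all hyperparameter values (`alpha`, `gamma`, `lam`, `epsilon`) are assumptions made here purely for illustration.

```python
import random


def sarsa_lambda(n_states=6, episodes=200, alpha=0.5, gamma=0.9,
                 lam=0.8, epsilon=0.1, seed=0, max_steps=1000):
    """Tabular Sarsa(lambda) on a toy chain (hypothetical task, not the bicycle).

    States are 0..n_states-1; action 0 moves left, action 1 moves right.
    Reaching the rightmost state gives reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]  # action-value table

    def choose(s):
        # Epsilon-greedy selection; ties are broken at random.
        if rng.random() < epsilon or q[s][0] == q[s][1]:
            return rng.randrange(2)
        return 0 if q[s][0] > q[s][1] else 1

    def step(s, a):
        s2 = max(0, s - 1) if a == 0 else s + 1
        done = s2 == goal
        return s2, (1.0 if done else 0.0), done

    for _ in range(episodes):
        e = [[0.0, 0.0] for _ in range(n_states)]  # eligibility traces
        s, a = 0, choose(0)
        done, steps = False, 0
        while not done and steps < max_steps:
            s2, r, done = step(s, a)
            a2 = 0 if done else choose(s2)
            # On-policy TD error: the target uses the action actually chosen next.
            target = r if done else r + gamma * q[s2][a2]
            delta = target - q[s][a]
            e[s][a] = 1.0  # replacing trace (an assumption of this sketch)
            for i in range(n_states):
                for j in range(2):
                    q[i][j] += alpha * delta * e[i][j]
                    e[i][j] *= gamma * lam  # decay every trace
            s, a = s2, a2
            steps += 1
    return q


if __name__ == "__main__":
    for state, values in enumerate(sarsa_lambda()):
        print(state, [round(v, 3) for v in values])
```

The trace table `e` is what separates Sarsa(λ) from one-step Sarsa: a single TD error updates every recently visited state-action pair in proportion to its decayed trace, so a delayed reward propagates back along the trajectory within one episode.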

Preben Alstrøm | Jette Randløv
