Learning to Drive a Bicycle Using Reinforcement Learning and Shaping

We present and solve a real-world problem of learning to drive a bicycle. We solve the problem by online reinforcement learning using the Sarsa(λ)-algorithm. Then we solve the composite problem of learning to balance a bicycle and then drive to a goal. In our approach the reinforcement function is independent of the task the agent tries to learn to solve.
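
For readers unfamiliar with the algorithm named above, the following is a minimal sketch of tabular Sarsa(λ), not the implementation used for the bicycle. Everything specific in it is an assumption made for illustration: the toy chain environment (`step`), the function name `sarsa_lambda`, the hyperparameter values, and the use of replacing eligibility traces, which is one common variant.

```python
import random


def sarsa_lambda(n_states=6, n_actions=2, episodes=200, alpha=0.5,
                 gamma=0.9, lam=0.8, epsilon=0.1, seed=0):
    """Tabular Sarsa(lambda) on a hypothetical chain: action 1 moves right,
    action 0 moves left, and reaching the last state pays reward 1."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]

    def choose(s):
        # Epsilon-greedy action selection with random tie-breaking.
        if rng.random() < epsilon:
            return rng.randrange(n_actions)
        best = max(q[s])
        return rng.choice([a for a in range(n_actions) if q[s][a] == best])

    def step(s, a):
        # Toy dynamics (an assumption, unrelated to bicycle physics).
        s2 = max(0, s - 1) if a == 0 else s + 1
        done = s2 == n_states - 1
        return s2, (1.0 if done else 0.0), done

    for _ in range(episodes):
        # Eligibility traces are reset at the start of every episode.
        e = [[0.0] * n_actions for _ in range(n_states)]
        s = 0
        a = choose(s)
        done = False
        while not done:
            s2, r, done = step(s, a)
            a2 = choose(s2)
            # On-policy TD error: bootstrap from the action actually chosen.
            target = r if done else r + gamma * q[s2][a2]
            delta = target - q[s][a]
            e[s][a] = 1.0  # replacing trace for the visited pair
            for i in range(n_states):
                for j in range(n_actions):
                    q[i][j] += alpha * delta * e[i][j]
                    e[i][j] *= gamma * lam  # decay every trace
            s, a = s2, a2
    return q


if __name__ == "__main__":
    for state, values in enumerate(sarsa_lambda()):
        print(state, [round(v, 3) for v in values])
```

The update is online in the sense that the value table changes after every single step rather than at the end of an episode, and the traces let one temporal-difference error adjust all recently visited state-action pairs at once.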
