Combining Reinforcement Learning with a Local Control Algorithm

We explore combining reinforcement learning with a hand-crafted local controller in a manner suggested by the chaotic control algorithm of Vincent, Schmitt and Vincent (1994). In that algorithm, a closed-loop controller is designed by conventional means to create a domain of attraction about a target state. Chaotic behavior is used or induced to bring the system into this region, at which point the local controller is turned on to bring the system to the target state and stabilize it there. We describe experiments in which we use reinforcement learning instead of, and in addition to, chaotic behavior to learn an efficient policy for driving the system into the local controller’s domain of attraction. Using a simulated double pendulum, we illustrate how this method allows reinforcement learning to be effective in a problem that cannot be easily solved by reinforcement learning alone, and we show how reinforcement learning can improve upon the chaotic control algorithm when the domain of attraction can only be approximately determined. Similar results are shown using the Hénon map. The combination is a simple and effective way of extending reinforcement learning to more difficult problems.
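The switching scheme described above can be illustrated with a minimal sketch on the Hénon map. Everything specific here is an assumption made for illustration and is not taken from the paper: the classic parameters a = 1.4 and b = 0.3, an additive control input on the x update, the actuator limit `U_MAX`, the circular estimate `RADIUS` of the domain of attraction, and a deadbeat linear gain for the local controller. The `approach_policy` argument is a hypothetical hook: the default applies no control, so the system wanders chaotically until it enters the region, and a learned reinforcement learning policy could be passed in its place.

```python
import math

A_PARAM, B_PARAM = 1.4, 0.3   # classic Henon parameters (assumed, not from the paper)
U_MAX = 0.2                   # assumed bound on the control input
RADIUS = 0.05                 # assumed (approximate) domain of attraction radius

# Unstable fixed point of the uncontrolled map, used here as the target state.
X_STAR = (-(1 - B_PARAM) + math.sqrt((1 - B_PARAM) ** 2 + 4 * A_PARAM)) / (2 * A_PARAM)
Y_STAR = B_PARAM * X_STAR


def step(x, y, u=0.0):
    """One iteration of the Henon map with an additive control u on x (assumed)."""
    return 1.0 - A_PARAM * x * x + y + u, B_PARAM * x


def local_controller(x, y):
    """Deadbeat linear feedback about the fixed point, clipped to the input bound."""
    dx, dy = x - X_STAR, y - Y_STAR
    u = 2.0 * A_PARAM * X_STAR * dx - dy
    return max(-U_MAX, min(U_MAX, u))


def in_domain(x, y):
    """Test membership in the assumed circular domain of attraction."""
    return math.hypot(x - X_STAR, y - Y_STAR) < RADIUS


def run(x, y, approach_policy=lambda x, y: 0.0, max_steps=100000, hold_steps=10):
    """Apply approach_policy until the state enters the domain, then switch
    to the local controller for hold_steps iterations.

    Returns (steps_to_enter, x, y), or None if the domain was never reached.
    """
    for n in range(max_steps):
        if in_domain(x, y):
            for _ in range(hold_steps):
                x, y = step(x, y, local_controller(x, y))
            return n, x, y
        x, y = step(x, y, approach_policy(x, y))
    return None


result = run(0.0, 0.0)
```

With the default policy this corresponds to waiting for chaotic motion to deliver the state to the controller. A learned policy would aim to reduce the number of steps spent outside the region, and the choice of `RADIUS` reflects how well the domain of attraction is known.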

[1] M. Hénon, et al. A two-dimensional mapping with a strange attractor, 1976.

[2] Katsuhiko Ogata, et al. Discrete-time control systems, 1987.

[3] Tyrone L. Vincent, et al. A Chaotic Controller for the Double Pendulum, 1994.

[4] Mark W. Spong, et al. The swing up control problem for the Acrobot, 1995.

[5] Gavin Adrian Rummery. Problem solving with reinforcement learning, 1995.

[6] Richard S. Sutton, et al. Reinforcement Learning with Replacing Eligibility Traces, 2005, Machine Learning.

[7] Gary Boone, et al. Efficient reinforcement learning: model-based Acrobot control, 1997, Proceedings of International Conference on Robotics and Automation.

[8] Stefan Schaal, et al. Robot Learning From Demonstration, 1997, ICML.

[9] T. Vincent. Control using chaos, 1997.

[10] Christopher G. Atkeson, et al. A comparison of direct and model-based reinforcement learning, 1997, Proceedings of International Conference on Robotics and Automation.

[11] Gary Boone, et al. Minimum-time control of the Acrobot, 1997, Proceedings of International Conference on Robotics and Automation.

[12] Mark W. Spong, et al. Control of underactuated mechanical systems using switching and saturation, 1997.

[13] Thomas L. Vincent. Controllable Targets Near a Chaotic Attractor, 1997.

[14] T. Vincent, et al. Nonlinear and Optimal Control Systems, 1997.

[15] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.

[16] Sabino Gadaleta, et al. Optimal chaos control through reinforcement learning, 1999, Chaos.