Paper information - B-Learning: A Reinforcement Learning Algorithm, Comparison with Dynamic Programming

In this paper we present a Reinforcement Learning method, B-Learning, for the control of a water production plant. A comparison between B-Learning and Dynamic Programming is provided from both theoretical and performance points of view. It is shown that Reinforcement-based neural control can lead to results comparable in quality to those of Dynamic Programming-based control, while being less computationally expensive.

Stéphane Canu | Thibault Langlois

[1] P. Villon, et al. A real-time optimal control algorithm for water treatment plants, 1993, System Modelling and Optimization.

[2] Long Ji Lin, et al. Programming Robots Using Reinforcement Learning and Teaching, 1991, AAAI.

[3] Ben J. A. Kröse, et al. Learning from delayed rewards, 1995, Robotics Auton. Syst.

[4] C.W. Anderson, et al. Learning to control an inverted pendulum using neural networks, 1989, IEEE Control Systems Magazine.

[5] Richard S. Sutton, et al. Neuronlike adaptive elements that can solve difficult learning control problems, 1983, IEEE Transactions on Systems, Man, and Cybernetics.

[6] Stéphane Canu, et al. B-Learning: A Reinforcement Learning Variant for the Control of a Plant, 1994.