Reinforcement Learning Using a Grid Based Function Approximator

Function approximators are commonly used in reinforcement learning systems to generalize to untrained situations when dealing with large problems. However, theoretical proofs of convergence for such systems are still lacking, and practical studies report both positive and negative results. In a recent work with neural networks [3], the authors reported that the final results did not reach the quality of a Q-table in which no approximation was used. In this paper, we continue this line of research with grid-based function approximators. In addition, we consider the number of state transitions required and apply ideas from the field of active learning to reduce it. We expect the learning process for a similar problem in a real-world system to be significantly shorter, because state transitions, which correspond to an object's actual movements, take much more time than basic computational processes.
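As an illustrative sketch only (the environment bounds, grid resolution, hyperparameters, and the names GridQApproximator and q_learning_update below are our own assumptions, not the paper's exact setup), the following Python snippet shows Q-learning with a simple grid-based function approximator: each continuous state is mapped to its grid cell, and one Q-vector is stored per cell.

```python
import numpy as np

class GridQApproximator:
    """Piecewise-constant Q-function over a regular grid of cells (illustrative)."""

    def __init__(self, low, high, cells_per_dim, n_actions):
        self.low = np.asarray(low, dtype=float)
        self.high = np.asarray(high, dtype=float)
        self.cells = np.asarray(cells_per_dim, dtype=int)
        # One Q-vector (length n_actions) per grid cell.
        self.q = np.zeros((*self.cells, n_actions))

    def _cell(self, state):
        # Map a continuous state to the index tuple of its grid cell.
        frac = (np.asarray(state, dtype=float) - self.low) / (self.high - self.low)
        idx = np.clip((frac * self.cells).astype(int), 0, self.cells - 1)
        return tuple(idx)

    def value(self, state):
        # Q-values of all actions for the cell containing `state`.
        return self.q[self._cell(state)]

    def update(self, state, action, target, lr=0.1):
        # Move the stored Q-value of (cell, action) toward the target.
        cell = self._cell(state)
        self.q[cell][action] += lr * (target - self.q[cell][action])


def q_learning_update(approx, state, action, reward, next_state, gamma=0.95):
    # Standard Q-learning target, evaluated on the grid approximation.
    target = reward + gamma * np.max(approx.value(next_state))
    approx.update(state, action, target)


# Example: a 2-D state space in [0, 1]^2 discretized into a 10x10 grid, 4 actions.
approx = GridQApproximator(low=[0.0, 0.0], high=[1.0, 1.0],
                           cells_per_dim=[10, 10], n_actions=4)
q_learning_update(approx, state=[0.42, 0.17], action=2,
                  reward=-1.0, next_state=[0.45, 0.20])
```

In the setting described above, an active-learning criterion would additionally decide which state transition to execute next in order to reduce their total number; that selection step is omitted from this sketch.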

[1] Martin A. Riedmiller. Concepts and Facilities of a Neural Reinforcement Learning Control Architecture for Technical Process Control, 1999, Neural Computing & Applications.

[2] M. Hasenjäger, et al. Active learning in neural networks, 2002.

[3] Peter Dayan, et al. Technical Note: Q-Learning, 2004, Machine Learning.

[4] Andres El-Fakdi, et al. Semi-online neural-Q_learning for real-time robot learning, 2003, Proceedings 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2003).

[5] Andreas Zell, et al. Different criteria for active learning in neural networks: a comparative study, 2002, ESANN.

[6] Partha Niyogi, et al. Active Learning for Function Approximation, 1994, NIPS.

[7] David A. Cohn, et al. Active Learning with Statistical Models, 1996, NIPS.

[8] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.

[9] Richard S. Sutton, et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding, 1995, NIPS.

[10] Peter Dayan, et al. Q-learning, 1992, Machine Learning.

[11] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.

[12] Andrew W. Moore, et al. Generalization in Reinforcement Learning: Safely Approximating the Value Function, 1994, NIPS.

[13] Charles W. Anderson, et al. Comparison of CMACs and radial basis functions for local function approximators in reinforcement learning, 1997, Proceedings of International Conference on Neural Networks (ICNN'97).

[14] David A. Cohn, et al. Improving generalization with active learning, 1994, Machine Learning.

[15] Leemon C. Baird, et al. Residual Algorithms: Reinforcement Learning with Function Approximation, 1995, ICML.

[16] Vipin Kumar. Proceedings of the 2002 IEEE International Conference on Data Mining (ICDM 2002), 9-12 December 2002, Maebashi City, Japan, 2002.

[17] Marios M. Polycarpou, et al. An analytical framework for local feedforward networks, 1998, IEEE Trans. Neural Networks.

[18] Artur Merke, et al. A Necessary Condition of Convergence for Reinforcement Learning with Function Approximation, 2002, ICML.

[19] Zhiqiang Zheng, et al. On active learning for data acquisition, 2002, Proceedings of the 2002 IEEE International Conference on Data Mining (ICDM 2002).