Expertness based cooperative Q-learning

By using other agents' experiences and knowledge, a learning agent can learn faster, make fewer mistakes, and derive rules for situations it has not yet encountered. These benefits are realized only if the learning agent can extract, from the other agents' knowledge, rules suited to its own requirements. One way to do this is to have the learner assign an expertness value (an intelligence-level rating) to each of the other agents and weight their knowledge accordingly. Several criteria for measuring the expertness of reinforcement learning agents are introduced, and a new cooperative learning method, called weighted strategy sharing (WSS), is presented. In WSS, each agent measures the expertness of its teammates, assigns a weight to each teammate's knowledge, and learns from it in proportion to that weight. The presented methods are tested on two hunter-prey systems. The case in which all agents learn from one another is compared with the case in which agents cooperate only with their more expert teammates. The effect of communication noise, as a source of uncertainty, on the cooperative learning method is also studied. Finally, the Q-table of one of the cooperative agents is perturbed at random, and the effect of this perturbation on the presented methods is examined.
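
To make the WSS combination step concrete, the sketch below shows one way an agent could fold its teammates' Q-tables into its own, weighted by relative expertness. It is a minimal illustration under stated assumptions: the reward-sum expertness measure, the minimum-shifted weighting rule, and the parameter names (e.g. `impressibility`) are choices made here for clarity, not the paper's verbatim formulas.

```python
import numpy as np

def expertness(rewards):
    """One simple expertness measure: the algebraic sum of the rewards an
    agent has collected so far (the paper introduces several such criteria)."""
    return float(np.sum(rewards))

def wss_update(q_tables, expertness_values, impressibility=0.5):
    """Weighted strategy sharing (WSS), minimal sketch.

    Each agent forms a new Q-table as a convex combination of all agents'
    tables: it keeps a (1 - impressibility) share of its own table and
    distributes the remaining share among the agents in proportion to
    their relative expertness.
    """
    q = np.stack(q_tables)                   # (n_agents, n_states, n_actions)
    e = np.asarray(expertness_values, dtype=float)
    e = e - e.min()                          # shift so all weights are non-negative
    total = e.sum()
    new_tables = []
    for i in range(len(q_tables)):
        w = np.zeros(len(q_tables))
        if total > 0:
            w = impressibility * e / total   # shares by relative expertness
            w[i] += 1.0 - impressibility     # retain part of the agent's own table
        else:
            w[i] = 1.0                       # no expertness signal: keep own table
        new_tables.append(np.tensordot(w, q, axes=1))
    return new_tables

# Example: three agents with 5-state, 2-action Q-tables.
rng = np.random.default_rng(0)
tables = [rng.random((5, 2)) for _ in range(3)]
reward_histories = ([1, 1, -1], [1, 1, 1], [-1, -1, -1])
e_vals = [expertness(r) for r in reward_histories]
shared = wss_update(tables, e_vals, impressibility=0.5)
```

Setting impressibility to zero recovers independent learning (each agent keeps its own table), while values near one make agents rely almost entirely on their more expert teammates; the minimum shift keeps all weights non-negative even when accumulated rewards are negative.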
