Multi-agent Case-Based Reasoning for Cooperative Reinforcement Learners

In both Case-Based Reasoning and Reinforcement Learning, the system under consideration gains its expertise from experience. Building on this common ground, and on further characteristics and results of the two disciplines, we develop an approach that facilitates the distributed learning of behaviour policies in cooperative multi-agent domains without communication between the learning agents. We evaluate our algorithms in a case study in reactive production scheduling.
