Using Machine Learning Techniques in Complex Multi-Agent Domains

A challenging direction in current research is the design of intelligent software systems, or 'agents', that can autonomously solve tasks within their environment. Software agents find applications in robotics, for example as agents that control robots rescuing people in dangerous environments, and in virtual worlds such as electronic markets, where intelligent agents must compete against other market participants that pursue their own goals.
