The main interest behind the Brainstormers' effort in the RoboCup soccer domain is to develop and apply machine learning techniques in complex domains. In particular, we are interested in reinforcement learning methods, where the training signal is given only in terms of success or failure. Our final goal is a learning system into which we only plug in 'win the match', and our agents learn to generate the appropriate behaviour. Unfortunately, even very optimistic complexity estimates make it obvious that in the soccer domain both conventional solution techniques and today's advanced reinforcement learning techniques reach their limits: there are more than (10^8 × 50)^23 different states and more than 1000^300 different policies per agent per half time. This paper describes the new architecture of the Brainstormers team, the improved self-localization using particle filters, and the extensions of the learning algorithm to simultaneously learn behaviors with and without the ball.
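The order-of-magnitude estimates above can be checked with a short calculation. This is a hedged sketch, not the authors' derivation: it assumes the state count arises from 23 objects (22 players plus the ball), each with roughly 10^8 × 50 discretized configurations, and the policy count from about 1000 action choices at each of roughly 300 decision points per half time.

```python
import math

# Assumed decomposition of the state-space estimate: 23 objects,
# each with ~10^8 * 50 discretized configurations.
states_per_object = 10**8 * 50
num_objects = 23
log10_states = num_objects * math.log10(states_per_object)
print(f"state space ~ 10^{log10_states:.0f}")   # ~10^223

# Assumed decomposition of the policy-space estimate: ~1000 actions
# available at each of ~300 decision points per agent per half time.
actions_per_decision = 1000
decision_points = 300
log10_policies = decision_points * math.log10(actions_per_decision)
print(f"policies per agent per half ~ 10^{log10_policies:.0f}")  # 10^900
```

Either way the numbers are far beyond exhaustive enumeration, which is why the paper turns to approximate, learned behaviors rather than exact solution techniques.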