The problem of discovering complex control policies for continuous tasks is often best addressed by decomposing the policy into several simpler parts. While a genetic algorithm usually decomposes the task by encoding a chromosome over multiple genes, it faces the difficult credit assignment problem of evaluating how a single allele in a chromosome contributes to the full solution. Typically, a single evaluation function is used for the entire chromosome, implicitly giving each allele in the chromosome the same evaluation. This method is inefficient because an allele will get credit for the contribution of all the other alleles as well. Accurately measuring the fitness of an individual allele in such a large search space requires many trials and may result in stagnation. This paper instead proposes turning this single complex search problem into a multi-agent search problem, where each agent has the simpler task of discovering a suitable allele. Gene-specific evaluation functions can then be created that have better theoretical properties than a single evaluation function over all genes. Even though each gene has its own evaluation function, through the process of self-organization a set of compatible alleles can be found to form a high-performing chromosome. The method is tested on the double-pole balancing problem, showing that agents that self-organize using gene-specific evaluation functions can create a successful control policy in 20% fewer trials than the best existing genetic algorithms.
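The contrast between a shared evaluation and gene-specific evaluations can be illustrated with a minimal sketch. The abstract does not define the gene-specific functions, so the code below assumes a difference-style evaluation (the full score minus the score with one gene replaced by a baseline allele), purely for illustration. The toy fitness `global_evaluation`, the `target` vector, and the `baseline_allele` parameter are hypothetical and are not the double-pole balancing task.

```python
from typing import List, Sequence


def global_evaluation(chromosome: Sequence[float]) -> float:
    """Toy full-chromosome fitness (hypothetical stand-in for a control task)."""
    target = [1.0, -2.0, 0.5]  # assumed ideal alleles, for illustration only
    return -sum((a - t) ** 2 for a, t in zip(chromosome, target))


def gene_specific_evaluation(
    chromosome: Sequence[float], gene_index: int, baseline_allele: float = 0.0
) -> float:
    """Assumed difference-style evaluation for a single gene.

    Scores the gene by how much the full evaluation changes when its
    allele is replaced by a fixed baseline, holding other genes constant.
    """
    counterfactual: List[float] = list(chromosome)
    counterfactual[gene_index] = baseline_allele
    return global_evaluation(chromosome) - global_evaluation(counterfactual)


chromosome = [1.0, 0.0, 3.0]

# Shared evaluation: every allele receives the same score.
shared = [global_evaluation(chromosome)] * len(chromosome)

# Gene-specific evaluation: each allele receives its own score.
per_gene = [gene_specific_evaluation(chromosome, i) for i in range(len(chromosome))]

print(shared)    # [-10.25, -10.25, -10.25]
print(per_gene)  # [1.0, 0.0, -6.0]
```

Under the shared evaluation, the well-chosen first allele and the poorly chosen third allele look equally bad. Under the assumed per-gene evaluation, the first allele is credited and the third is penalized, which is the kind of sharper credit assignment the abstract argues for.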