Measuring Generalization Performance in Coevolutionary Learning

Coevolutionary learning involves a training process in which the training samples are themselves solutions that interact strategically to guide the evolutionary (learning) process. A main research issue is generalization performance, i.e., the search for solutions (e.g., input-output mappings) that best predict the required output for any new input not seen during the evolutionary process. However, there is currently no framework for determining generalization performance in coevolutionary learning, even though the notion of generalization is well understood in machine learning. In this paper, we introduce a theoretical framework to address this issue. We present the framework in terms of game-playing, although our results are more general. Here, a strategy's generalization performance is its average performance against all test strategies. Because the true value may not be obtainable analytically from a closed-form formula and is computationally prohibitive to compute exactly, we propose an estimation procedure that instead computes the average performance against a small sample of random test strategies. We perform a mathematical analysis to provide a statistical claim on the accuracy of this estimation procedure, which can be further improved by performing a second estimation of the variance of the random variable. For game-playing, it is well known that one is more interested in the generalization performance against a biased and diverse sample of "good" test strategies. We introduce a simple approach to obtain such a test sample through multiple partial enumerative search of the strategy space; the approach does not require human expertise and is generally applicable to a wide range of domains. We illustrate the generalization framework on the coevolutionary learning of the iterated prisoner's dilemma (IPD) game. We investigate two definitions of generalization performance for the IPD game based on different performance criteria: one in terms of the number of wins based on individual outcomes, and the other in terms of average payoff. We show that a small sample of test strategies can be used to estimate the generalization performance. We also show that, for the IPD game, the generalization performance measured against a biased and diverse set of "good" test strategies is lower than in the unbiased case. This is the first time that generalization has been defined and analyzed rigorously in coevolutionary learning. The framework allows the generalization performance of any coevolutionary learning system to be evaluated quantitatively.
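
The estimation procedure described above can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the strategies are deterministic memory-one IPD strategies (so the whole space has only 32 members and the true average can be enumerated for comparison), the payoffs are the commonly used values 5, 3, 1, 0, and the accuracy statement uses Chebyshev's inequality with either the worst-case variance of a payoff bounded in [0, 5] or a sample variance. The names `play`, `estimate_generalization`, and `chebyshev_bound` are hypothetical.

```python
import random
from itertools import product

# Assumed IPD payoffs, keyed by (own move, opponent move).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}
HISTORIES = list(product("CD", repeat=2))  # (own last move, opponent last move)


def all_strategies():
    """Enumerate every deterministic memory-one strategy (2**5 = 32 in total)."""
    for first, *responses in product("CD", repeat=5):
        yield {"first": first, "table": dict(zip(HISTORIES, responses))}


def random_strategy(rng):
    """Draw one memory-one strategy uniformly at random."""
    return {
        "first": rng.choice("CD"),
        "table": {h: rng.choice("CD") for h in HISTORIES},
    }


def play(s, t, rounds=50):
    """Average per-round payoff earned by strategy s against strategy t."""
    a, b = s["first"], t["first"]
    total = 0
    for _ in range(rounds):
        total += PAYOFF[(a, b)]
        # Each player conditions on (own last move, opponent last move).
        a, b = s["table"][(a, b)], t["table"][(b, a)]
    return total / rounds


def estimate_generalization(s, n_tests, rng, rounds=50):
    """Sample mean and unbiased sample variance of s's payoff
    against n_tests random test strategies."""
    scores = [play(s, random_strategy(rng), rounds) for _ in range(n_tests)]
    mean = sum(scores) / n_tests
    var = sum((x - mean) ** 2 for x in scores) / (n_tests - 1)
    return mean, var


def chebyshev_bound(variance, n_tests, eps):
    """Chebyshev upper bound on P(|estimate - true value| >= eps)."""
    return min(1.0, variance / (n_tests * eps ** 2))


# Tit-for-tat: cooperate first, then copy the opponent's last move.
TFT = {"first": "C", "table": {(a, b): b for a, b in HISTORIES}}

if __name__ == "__main__":
    rng = random.Random(0)
    population = list(all_strategies())
    true_value = sum(play(TFT, t) for t in population) / len(population)
    est, var = estimate_generalization(TFT, n_tests=200, rng=rng)
    print("true:", true_value, "estimate:", est)
    # Worst-case variance of a variable bounded in [0, 5] is 25 / 4.
    print("bound (worst-case variance):", chebyshev_bound(6.25, 200, 0.5))
    # A second estimation (the sample variance) typically tightens the bound.
    print("bound (sample variance):", chebyshev_bound(var, 200, 0.5))
```

In this toy setting the exact average is cheap to compute, which makes it easy to check the estimate. In a realistic strategy space enumeration is infeasible and only the sampled estimate and its bound are available.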
