Opponent models allow software agents to assess a multi-agent environment more accurately and thereby improve their performance. This paper uses coarse approximations of game-theoretic player representations to improve the performance of software players in Limit Texas Hold 'Em poker. A 10-parameter model, intended to represent a combination, or mixture, of various strategies, is developed to model the opponent. A 'mixture identifier' is then evolved using the NEAT neuroevolution method to estimate the values of these parameters for arbitrary opponents. To evaluate this approach, two poker players, represented as neural networks, were evolved under identical conditions, one with the mixture identifier and one without. The player trained with access to the identifier achieved consistently higher and more stable fitness during evolution than the player without it. Furthermore, after training, the player with the identifier outplays the other in a heads-up match, winning on average 60% of the money at the table. These results demonstrate that opponent modeling is effective even with low-dimensional models and confers an advantage on players trained to use them.
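The 10-parameter mixture model is not specified in detail in the abstract. As a minimal illustrative sketch only (the base-strategy names, their action distributions, and the four-strategy set below are all assumptions, not the paper's actual parameterization), an opponent can be represented as a weighted mixture of fixed base strategies; the identifier's task is then to estimate the mixture weights from observed play:

```python
# Sketch of a mixture-of-strategies opponent model. Each base strategy is a
# probability distribution over poker actions; the opponent is modeled as a
# convex combination of these strategies, parameterized by mixture weights.
# All names and numbers here are illustrative assumptions.

ACTIONS = ["fold", "call", "raise"]

# Base strategies: P(fold), P(call), P(raise) for each stereotyped style.
BASE_STRATEGIES = {
    "tight_passive":    [0.60, 0.35, 0.05],
    "loose_passive":    [0.20, 0.70, 0.10],
    "tight_aggressive": [0.50, 0.20, 0.30],
    "loose_aggressive": [0.15, 0.35, 0.50],
}

def mixture_action_probs(weights):
    """Predicted opponent action distribution under mixture `weights`.

    `weights` must align with BASE_STRATEGIES and sum to 1; in the paper's
    setting these would be the parameters estimated by the evolved identifier.
    """
    probs = [0.0] * len(ACTIONS)
    for dist, w in zip(BASE_STRATEGIES.values(), weights):
        for i, p in enumerate(dist):
            probs[i] += w * p
    return probs

# A pure "tight_passive" opponent recovers that strategy's distribution:
# mixture_action_probs([1.0, 0.0, 0.0, 0.0]) -> [0.60, 0.35, 0.05]
```

A player network given these estimated weights as extra inputs can condition its betting decisions on the predicted opponent style, which is the advantage the evaluation above measures.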