Evolving Adaptive LSTM Poker Players for Effective Opponent Exploitation

In many imperfect-information games, the ability to exploit the opponent is crucial for achieving high performance. For instance, skilled poker players usually capitalize on weaknesses in their opponents’ playing patterns and styles to maximize their earnings. It is therefore important to enable computer players in such games to identify flaws in opponent strategies and adapt their behavior to exploit these flaws. This paper presents a genetic algorithm to evolve adaptive LSTM (Long Short-Term Memory) poker players capable of effective opponent exploitation. Experimental results in heads-up no-limit Texas Hold’em demonstrate that adaptive LSTM players obtain 40% to 1360% more earnings than cutting-edge game-theoretic poker players against opponents with various flawed strategies. In addition, experimental results indicate that adaptive LSTM players evolved by playing against simple and weak rule-based opponents can achieve comparable performance against top game-theoretic poker players. The approach introduced in this paper is a promising start for building adaptive computer players for imperfect-information games.
