Opponent modeling and exploitation in poker using evolved recurrent neural networks

As a classic example of an imperfect-information game, Heads-Up No-Limit Texas Hold'em (HUNL) has been studied extensively in recent years. While state-of-the-art approaches based on Nash equilibrium have been successful, they lack the ability to model and exploit opponents effectively. This paper presents an evolutionary approach to discovering opponent models based on recurrent neural networks (LSTMs) and Pattern Recognition Trees. Experimental results show that poker agents built with this method can adapt to opponents they have never seen in training and exploit weak strategies far more effectively than Slumbot 2017, one of the leading Nash-equilibrium-based poker agents. In addition, agents evolved by playing against relatively weak rule-based opponents tied statistically with Slumbot in heads-up matches. The proposed approach is therefore a promising new direction for building high-performance adaptive agents in HUNL and other imperfect-information games.
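To make the described architecture concrete, below is a minimal sketch (not the authors' implementation) of the two ingredients named in the abstract: an LSTM that reads a heads-up betting sequence and outputs a distribution over abstract actions, and a simple evolutionary step that perturbs network weights and keeps the fittest agents. PyTorch is assumed, and the names `BetHistoryLSTM`, `mutate`, `evolve_step`, and the dummy fitness function are hypothetical.

```python
# Hedged sketch of an evolved LSTM opponent model; names and encodings are assumptions.
import copy
import torch
import torch.nn as nn

ACTIONS = ["fold", "call", "raise_small", "raise_large"]

class BetHistoryLSTM(nn.Module):
    """Maps an encoded betting/board history to a distribution over abstract actions."""
    def __init__(self, input_size=8, hidden_size=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, len(ACTIONS))

    def forward(self, history):
        # history: (batch, timesteps, input_size) encoding of the hand so far
        _, (h_n, _) = self.lstm(history)
        return torch.softmax(self.head(h_n[-1]), dim=-1)

def mutate(net, sigma=0.02):
    """Return a copy of `net` with Gaussian noise added to every parameter."""
    child = copy.deepcopy(net)
    with torch.no_grad():
        for p in child.parameters():
            p.add_(sigma * torch.randn_like(p))
    return child

def evolve_step(population, fitness_fn, n_parents=4):
    """One generation: keep the best agents, refill the population with mutated copies."""
    ranked = sorted(population, key=fitness_fn, reverse=True)
    parents = ranked[:n_parents]
    children = [mutate(parents[i % n_parents]) for i in range(len(population) - n_parents)]
    return parents + children

if __name__ == "__main__":
    # Stand-in fitness; in the paper's setting it would be chips won against
    # a pool of opponents over many simulated hands.
    def dummy_fitness(net):
        x = torch.randn(1, 10, 8)  # fake 10-step betting history
        return net(x)[0, ACTIONS.index("raise_large")].item()

    pop = [BetHistoryLSTM() for _ in range(8)]
    for gen in range(3):
        pop = evolve_step(pop, dummy_fitness)
        print(f"generation {gen}: best fitness {max(dummy_fitness(n) for n in pop):.3f}")
```

In this sketch, fitness is evaluated by simulation rather than gradient descent, which mirrors the evolutionary framing of the paper; the actual feature encoding, population size, and selection scheme would differ.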
