Paper Information - Monte Carlo *-Minimax Search

Monte Carlo *-Minimax Search

This paper introduces Monte Carlo *-Minimax Search (MCMS), a Monte Carlo search algorithm for turn-based, stochastic, two-player, zero-sum games of perfect information. The algorithm is designed for the class of densely stochastic games; that is, games where one would rarely expect to sample the same successor state multiple times at any particular chance node. Our approach combines sparse sampling techniques from MDP planning with classic pruning techniques developed for adversarial expectimax planning. We compare our algorithm with the traditional *-Minimax approaches, as well as with MCTS enhanced with Double Progressive Widening, on four games: Pig, EinStein Würfelt Nicht!, Can't Stop, and Ra. Our results show that MCMS can be competitive with enhanced MCTS variants in some domains, while consistently outperforming the equivalent classic approaches given the same amount of thinking time.
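The sparse-sampling idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's MCMS algorithm: it only shows a depth-limited minimax search in which every chance node is estimated from a fixed number of sampled outcomes instead of a full enumeration, and it omits the *-Minimax pruning that MCMS layers on top. The function name `sampled_expectimax`, the `width` parameter, and the toy dice game are all assumptions introduced here for illustration.

```python
import random
from typing import Callable, List


def sampled_expectimax(
    state,
    depth: int,
    maximizing: bool,
    legal_moves: Callable,
    sample_outcome: Callable,
    evaluate: Callable,
    width: int,
    rng: random.Random,
) -> float:
    """Depth-limited minimax; each chance node is averaged over `width` samples."""
    moves = legal_moves(state)
    if depth == 0 or not moves:
        return float(evaluate(state))

    best = float("-inf") if maximizing else float("inf")
    for move in moves:
        # Estimate the chance node that follows `move` by sampling successors
        # rather than enumerating every stochastic outcome.
        total = 0.0
        for _ in range(width):
            successor = sample_outcome(state, move, rng)
            total += sampled_expectimax(
                successor, depth - 1, not maximizing,
                legal_moves, sample_outcome, evaluate, width, rng,
            )
        estimate = total / width
        best = max(best, estimate) if maximizing else min(best, estimate)
    return best


# Hypothetical toy game: the state is a running total. "safe" adds 1,
# "risky" adds a six-sided die roll. Max wants a high total, Min a low one.
def toy_moves(state: int) -> List[str]:
    return ["safe", "risky"]


def toy_outcome(state: int, move: str, rng: random.Random) -> int:
    return state + 1 if move == "safe" else state + rng.randint(1, 6)


def toy_eval(state: int) -> int:
    return state


value = sampled_expectimax(
    0, 2, True, toy_moves, toy_outcome, toy_eval, width=8, rng=random.Random(0)
)
print(value)
```

The cost of this sketch grows with `width` and the number of moves rather than with the number of possible chance outcomes, which is the property that makes sampling attractive in densely stochastic games. The estimate is noisy for small `width`; how to bound that error and how to prune with it are the subjects of the paper itself.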

[1] Jos W. H. M. Uiterwijk, et al. CHANCEPROBCUT: Forward pruning in chance nodes, 2009, 2009 IEEE Symposium on Computational Intelligence and Games.

[2] Yishay Mansour, et al. A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes, 1999, Machine Learning.

[3] Csaba Szepesvári, et al. Bandit Based Monte-Carlo Planning, 2006, ECML.

[4] Tzung-Pei Hong, et al. The Computational Intelligence of MoGo Revealed in Taiwan's Computer Go Tournaments, 2009, IEEE Transactions on Computational Intelligence and AI in Games.

[5] Rémi Coulom, et al. Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search, 2006, Computers and Games.

[6] Bruce W. Ballard, et al. The *-Minimax Search Procedure for Trees Containing Chance Nodes, 1983, Artif. Intell.

[7] D. Michie. GAME-PLAYING AND GAME-LEARNING AUTOMATA, 1966.

[8] Clifton G. M. Presser, et al. Optimal Play of the Dice Game Pig, 2004.

[9] Donald E. Knuth, et al. The Solution for the Branching Factor of the Alpha-Beta Pruning Algorithm, 1981, ICALP.

[10] Sid Sackson. Can't stop, 1998.

[11] James Glenn, et al. Optimizing Genetic Algorithm Parameters for a Stochastic Game, 2010, IJCCI.

[12] Richard J. Lorentz. An MCTS Program to Play EinStein Würfelt Nicht!, 2011, ACG.

[13] H. Jaap van den Herik, et al. Proceedings of the 12th international conference on Advances in Computer Games, 2006.

[14] Cathleen Heyden, et al. IMPLEMENTING A COMPUTER PLAYER FOR CARCASSONNE, 2009.

[15] R. M. Burstall, et al. Advances in programming and non-numerical computation, 1967, The Mathematical Gazette.

[16] Thomas J. Walsh, et al. Integrating Sample-Based Planning and Model-Based Reinforcement Learning, 2010, AAAI.

[17] Bart Selman, et al. Trade-Offs in Sampling-Based Adversarial Planning, 2011, ICAPS.

[18] Clyde P. Kruskal, et al. Retrograde Approximation Algorithms for Jeopardy Stochastic Games, 2008, J. Int. Comput. Games Assoc.

[19] H. Jaap van den Herik, et al. Progressive Strategies for Monte-Carlo Tree Search, 2008.

[20] Jonathan Schaeffer, et al. Rediscovering *-Minimax Search, 2004, Computers and Games.

[21] Joel Veness, et al. Variance Reduction in Monte-Carlo Tree Search, 2011, NIPS.

[22] Todd W. Neller, et al. Pigtail: A Pig Addendum, 2005.

[23] Nataliya Sokolovska, et al. Continuous Upper Confidence Trees, 2011, LION.

[24] W. Hoeffding. Probability Inequalities for sums of Bounded Random Variables, 1963.

[25] Mark H. M. Winands, et al. Monte Carlo Tree Search in Lines of Action, 2010, IEEE Transactions on Computational Intelligence and AI in Games.

[26] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.

[27] Dana Nau, et al. Toward an analysis of forward pruning, 1993.

[28] Simon M. Lucas, et al. A Survey of Monte Carlo Tree Search Methods, 2012, IEEE Transactions on Computational Intelligence and AI in Games.

[29] Michael C. Fu, et al. An Adaptive Sampling Algorithm for Solving Markov Decision Processes, 2005, Oper. Res.

[30] Rémi Coulom, et al. Computing "Elo Ratings" of Move Patterns in the Game of Go, 2007, J. Int. Comput. Games Assoc.