Bandit Based Monte-Carlo Planning
[1] Dana S. Nau, et al. An Analysis of Forward Pruning, 1994, AAAI.
[2] Gerald Tesauro, et al. On-line Policy Improvement using Monte-Carlo Search, 1996, NIPS.
[3] Peter Auer, et al. The Nonstochastic Multiarmed Bandit Problem, 2002, SIAM J. Comput.
[4] Jonathan Schaeffer, et al. The challenge of poker, 2002, Artif. Intell.
[5] Brian Sheppard, et al. World-championship-caliber Scrabble, 2002, Artif. Intell.
[6] Bruno Bouzy, et al. Monte-Carlo Go Developments, 2003, ACG.
[7] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[8] Frédérick Garcia, et al. On-Line Search for Solving Markov Decision Processes via Heuristic Sampling, 2004, ECAI.
[9] Yishay Mansour, et al. A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes, 1999, Machine Learning.
[10] Michael C. Fu, et al. An Adaptive Sampling Algorithm for Solving Markov Decision Processes, 2005, Oper. Res.
[11] Jonathan Schaeffer, et al. Monte Carlo Planning in RTS Games, 2005, CIG.
[12] T. L. Lai and Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules, 1985, Adv. Appl. Math.