[1] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[2] K. Schlag. ELEVEN - Tests needed for a Recommendation , 2006 .
[3] Olivier Teytaud,et al. Modification of UCT with Patterns in Monte-Carlo Go , 2006 .
[4] John N. Tsitsiklis,et al. The Sample Complexity of Exploration in the Multi-Armed Bandit Problem , 2004, J. Mach. Learn. Res..
[5] Luc Devroye,et al. Combinatorial methods in density estimation , 2001, Springer series in statistics.
[6] Russell Greiner,et al. The Budgeted Multi-armed Bandit Problem , 2004, COLT.
[7] H. Robbins. Some aspects of the sequential design of experiments , 1952 .
[8] W. Hoeffding. Probability Inequalities for Sums of Bounded Random Variables , 1963 .
[9] Colin McDiarmid,et al. Surveys in Combinatorics, 1989: On the method of bounded differences , 1989 .
[10] R. Munos,et al. Best Arm Identification in Multi-Armed Bandits , 2010, COLT.
[11] Csaba Szepesvári,et al. Online Optimization in X-Armed Bandits , 2008, NIPS.
[12] Aleksandrs Slivkins,et al. Sharp dichotomies for regret minimization in metric spaces , 2009, SODA '10.
[13] Y. Freund,et al. The non-stochastic multi-armed bandit problem , 2001 .
[14] Csaba Szepesvári,et al. Bandit Based Monte-Carlo Planning , 2006, ECML.
[15] Csaba Szepesvári,et al. Tuning Bandit Algorithms in Stochastic Environments , 2007, ALT.
[16] T. L. Lai,et al. Asymptotically Efficient Adaptive Allocation Rules , 1985 .
[17] Csaba Szepesvári,et al. Exploration-exploitation tradeoff using variance estimates in multi-armed bandits , 2009, Theor. Comput. Sci..
[18] Rémi Munos,et al. Bandit Algorithms for Tree Search , 2007, UAI.
[19] Shie Mannor,et al. PAC Bounds for Multi-armed Bandit and Markov Decision Processes , 2002, COLT.
[20] Robert D. Kleinberg. Nearly Tight Bounds for the Continuum-Armed Bandit Problem , 2004, NIPS.