Pure exploration in finitely-armed and continuous-armed bandits
[1] Sonja Kuhnt, et al. Design and analysis of computer experiments, 2010.
[2] R. Munos, et al. Best Arm Identification in Multi-Armed Bandits, 2010, COLT.
[3] Aleksandrs Slivkins, et al. Sharp dichotomies for regret minimization in metric spaces, 2009, SODA '10.
[4] Csaba Szepesvári, et al. Exploration-exploitation tradeoff using variance estimates in multi-armed bandits, 2009, Theor. Comput. Sci.
[5] Csaba Szepesvári, et al. Online Optimization in X-Armed Bandits, 2008, NIPS.
[6] A. Tsybakov, et al. Gap-free Bounds for Stochastic Multi-Armed Bandit, 2008.
[7] Rémi Munos, et al. Bandit Algorithms for Tree Search, 2007, UAI.
[8] András György, et al. Continuous Time Associative Bandit Problems, 2007, IJCAI.
[9] Peter Auer, et al. Hannan Consistency in On-Line Learning in Case of Unbounded Losses Under Partial Monitoring, 2006, ALT.
[10] Csaba Szepesvári, et al. Bandit Based Monte-Carlo Planning, 2006, ECML.
[11] K. Schlag. ELEVEN - Tests needed for a Recommendation, 2006.
[12] Olivier Teytaud, et al. Modification of UCT with Patterns in Monte-Carlo Go, 2006.
[13] Gilles Stoltz. Incomplete information and internal regret in prediction of individual sequences, 2005.
[14] Gábor Lugosi, et al. Minimizing regret with label efficient prediction, 2004, IEEE Transactions on Information Theory.
[15] Robert D. Kleinberg. Nearly Tight Bounds for the Continuum-Armed Bandit Problem, 2004, NIPS.
[16] Russell Greiner, et al. The Budgeted Multi-armed Bandit Problem, 2004, COLT.
[17] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[18] John N. Tsitsiklis, et al. The Sample Complexity of Exploration in the Multi-Armed Bandit Problem, 2004, J. Mach. Learn. Res.
[19] Shie Mannor, et al. PAC Bounds for Multi-armed Bandit and Markov Decision Processes, 2002, COLT.
[20] Peter Auer, et al. The Nonstochastic Multiarmed Bandit Problem, 2002, SIAM J. Comput.
[21] Luc Devroye, et al. Combinatorial methods in density estimation, 2001, Springer series in statistics.
[22] Y. Freund, et al. The non-stochastic multi-armed bandit problem, 2001.
[23] Chun-Hung Chen, et al. Simulation Budget Allocation for Further Enhancing the Efficiency of Ordinal Optimization, 2000, Discret. Event Dyn. Syst.
[24] Colin McDiarmid, et al. Surveys in Combinatorics, 1989: On the method of bounded differences, 1989.
[25] H. Robbins, et al. Asymptotically efficient adaptive allocation rules, 1985.
[26] P. Billingsley, et al. Convergence of Probability Measures, 1970, The Mathematical Gazette.
[27] W. Hoeffding. Probability inequalities for sums of bounded random variables, 1963.
[28] H. Robbins. Some aspects of the sequential design of experiments, 1952.