Bandit view on noisy optimization
Jean-Yves Audibert | Sébastien Bubeck | Rémi Munos
[1] J. Doob. Stochastic Processes, 1953.
[2] W. Hoeffding. Probability Inequalities for Sums of Bounded Random Variables, 1963.
[3] D. Freedman. On Tail Probabilities for Martingales, 1975.
[4] H. Robbins, et al. Asymptotically Efficient Adaptive Allocation Rules, 1985.
[5] David Williams, et al. Probability with Martingales, 1991, Cambridge Mathematical Textbooks.
[6] Andrew W. Moore, et al. Hoeffding Races: Accelerating Model Selection Search for Classification and Function Approximation, 1993, NIPS.
[7] A. Burnetas, et al. Optimal Adaptive Policies for Sequential Allocation Problems, 1996.
[8] Richard M. Karp, et al. An Optimal Algorithm for Monte Carlo Estimation, 2000, SIAM J. Comput.
[9] Leslie Pack Kaelbling, et al. Sampling Methods for Action Selection in Influence Diagrams, 2000, AAAI/IAAI.
[10] Y. Freund, et al. The Non-stochastic Multi-armed Bandit Problem, 2001.
[11] John N. Tsitsiklis, et al. The Sample Complexity of Exploration in the Multi-Armed Bandit Problem, 2004, J. Mach. Learn. Res.
[12] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[13] Osamu Watanabe, et al. Adaptive Sampling Methods for Scaling Up Knowledge Discovery Algorithms, 1999, Data Mining and Knowledge Discovery.
[14] Shie Mannor, et al. Action Elimination and Stopping Conditions for the Multi-Armed Bandit and Reinforcement Learning Problems, 2006, J. Mach. Learn. Res.
[15] Csaba Szepesvári, et al. Bandit Based Monte-Carlo Planning, 2006, ECML.
[16] Rémi Munos, et al. Bandit Algorithms for Tree Search, 2007, UAI.
[17] P. Massart, et al. Concentration Inequalities and Model Selection, 2007.
[18] H. Robbins. Some Aspects of the Sequential Design of Experiments, 1952.
[19] Csaba Szepesvári, et al. Empirical Bernstein Stopping, 2008, ICML.
[20] Csaba Szepesvári, et al. Online Optimization in X-Armed Bandits, 2008, NIPS.
[21] Jean-Yves Audibert, et al. Minimax Policies for Adversarial and Stochastic Bandits, 2009, COLT.
[22] Massimiliano Pontil, et al. Empirical Bernstein Bounds and Sample-Variance Penalization, 2009, COLT.
[23] Csaba Szepesvári, et al. Exploration-Exploitation Tradeoff Using Variance Estimates in Multi-Armed Bandits, 2009, Theor. Comput. Sci.
[24] Rémi Munos, et al. Pure Exploration in Multi-armed Bandits Problems, 2009, ALT.
[25] Sébastien Bubeck. Bandits Games and Clustering Foundations, 2010.
[26] Jean-Yves Audibert. PAC-Bayesian Aggregation and Multi-armed Bandits, 2010.