Committing Bandits
[1] Peter Auer,et al. UCB revisited: Improved regret bounds for the stochastic multi-armed bandit problem , 2010, Period. Math. Hung..
[2] Rémi Munos,et al. Pure exploration in finitely-armed and continuous-armed bandits , 2011, Theor. Comput. Sci..
[3] Shie Mannor,et al. Action Elimination and Stopping Conditions for the Multi-Armed Bandit and Reinforcement Learning Problems , 2006, J. Mach. Learn. Res..
[4] John N. Tsitsiklis,et al. Linearly Parameterized Bandits , 2008, Math. Oper. Res..
[5] Csaba Szepesvári,et al. Bandit Based Monte-Carlo Planning , 2006, ECML.
[6] Rémi Munos,et al. Bandit Algorithms for Tree Search , 2007, UAI.
[7] Shie Mannor,et al. k-Armed Bandit , 2010, Encyclopedia of Machine Learning.
[8] Wei Chu,et al. A contextual-bandit approach to personalized news article recommendation , 2010, WWW '10.
[9] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[10] R. Agrawal. The Continuum-Armed Bandit Problem , 1995 .
[11] J. Langford,et al. The Epoch-Greedy algorithm for contextual multi-armed bandits , 2007, NIPS.
[12] John N. Tsitsiklis,et al. The Sample Complexity of Exploration in the Multi-Armed Bandit Problem , 2004, J. Mach. Learn. Res..
[13] Eli Upfal,et al. Multi-Armed Bandits in Metric Spaces , 2008.
[14] T. L. Lai, Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules , 1985, Adv. Appl. Math..