Paper information - Mistake bounds on the noise-free multi-armed bandit game


Abstract: We study the {0,1}-loss version of the adaptive adversarial multi-armed bandit problem with α (≥ 1) lossless arms. For this problem, we show a tight bound of K − α − Θ(1/T) on the minimax expected number of mistakes (1-losses), where K is the number of arms and T is the number of rounds.

Atsuyoshi Nakamura | Manfred K. Warmuth | David P. Helmbold
