Stochastic Multi-Armed Bandits with Unrestricted Delay Distributions
Yishay Mansour | Tomer Koren | Shahar Segal | Tal Lancewicki
[1] Pooria Joulani, et al. Adapting to Delays and Data in Adversarial Multi-Armed Bandits, 2020, ICML.
[2] Csaba Szepesvári, et al. Bandit Algorithms, 2020.
[3] Michal Valko, et al. Stochastic bandits with arm-dependent delays, 2020, ICML.
[4] Julian Zimmert, et al. An Optimal Algorithm for Adversarial Bandits with Arbitrary Delays, 2019, AISTATS.
[5] Nicolò Cesa-Bianchi, et al. Nonstochastic Multiarmed Bandits with Unrestricted Delays, 2019, NeurIPS.
[6] Renyuan Xu, et al. Learning in Generalized Linear Contextual Bandits with Stochastic Delays, 2019, NeurIPS.
[7] Xi Chen, et al. Online EXP3 Learning in Adversarial Bandits with Delayed Feedback, 2019, NeurIPS.
[8] Claudio Gentile, et al. Nonstochastic Bandits with Composite Anonymous Feedback, 2018, COLT.
[9] Csaba Szepesvári, et al. Bandits with Delayed, Aggregated Anonymous Feedback, 2017, ICML.
[10] Vianney Perchet, et al. Stochastic Bandit Models for Delayed Conversions, 2017, UAI.
[11] Claudio Gentile, et al. Delay and Cooperation in Nonstochastic Bandits, 2016, COLT.
[12] András György, et al. Online Learning under Delayed Feedback, 2013, ICML.
[13] Andreas Krause, et al. Parallelizing Exploration-Exploitation Tradeoffs with Gaussian Process Bandit Optimization, 2012, ICML.
[14] John Langford, et al. Efficient Optimal Learning for Contextual Bandits, 2011, UAI.
[15] Peter Auer, et al. UCB revisited: Improved regret bounds for the stochastic multi-armed bandit problem, 2010, Period. Math. Hung.
[16] Robert D. Kleinberg, et al. Regret bounds for sleeping experts and bandits, 2010, Machine Learning.
[17] Shie Mannor, et al. Action Elimination and Stopping Conditions for the Multi-Armed Bandit and Reinforcement Learning Problems, 2006, J. Mach. Learn. Res.
[18] Imre Csiszár, et al. Context tree estimation for not necessarily finite memory processes, via BIC and MDL, 2005, IEEE Transactions on Information Theory.
[19] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[20] E. Ordentlich, et al. On delayed prediction of individual sequences, 2002, Proceedings IEEE International Symposium on Information Theory.
[21] Peter Auer, et al. The Nonstochastic Multiarmed Bandit Problem, 2002, SIAM J. Comput.