Prediction with Limited Advice and Multiarmed Bandits with Paid Observations
Koby Crammer | Peter L. Bartlett | Yasin Abbasi-Yadkori | Yevgeny Seldin
[1] Dean P. Foster, et al. No Internal Regret via Neighborhood Watch, 2011, AISTATS.
[2] Sébastien Bubeck, et al. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems, 2012, Found. Trends Mach. Learn.
[3] Peter Auer, et al. The Nonstochastic Multiarmed Bandit Problem, 2002, SIAM J. Comput.
[4] Shie Mannor, et al. Decoupling Exploration and Exploitation in Multi-Armed Bandits, 2012, ICML.
[5] Gábor Lugosi, et al. Minimizing regret with label efficient prediction, 2004, IEEE Transactions on Information Theory.
[6] Shie Mannor, et al. From Bandits to Experts: On the Value of Side-Observations, 2011, NIPS.
[7] Noga Alon, et al. From Bandits to Experts: A Tale of Domination and Independence, 2013, NIPS.
[8] Claudio Gentile, et al. Adaptive and Self-Confident On-Line Learning Algorithms, 2000, J. Comput. Syst. Sci.
[9] Csaba Szepesvári, et al. Minimax Regret of Finite Partial-Monitoring Games in Stochastic Environments, 2011, COLT.
[10] Sébastien Bubeck. Bandits Games and Clustering Foundations, 2010.
[11] Russell Greiner, et al. Online Learning with Costly Features and Labels, 2013, NIPS.
[12] Jean-Yves Audibert, et al. Regret Bounds and Minimax Policies under Partial Monitoring, 2010, J. Mach. Learn. Res.
[13] Gábor Lugosi, et al. Prediction, learning, and games, 2006.
[14] András György, et al. The Combination of the Label Efficient and the Multi-Armed Bandit Problem in Adversarial Setting, 2006.