OSOM: A Simultaneously Optimal Algorithm for Multi-Armed and Linear Contextual Bandits
[1] T. L. Lai, Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules , 1985, Advances in Applied Mathematics.
[2] Karthik Sridharan,et al. BISTRO: An Efficient Relaxation-Based Method for Contextual Bandits , 2016, ICML.
[3] Vianney Perchet,et al. Anytime optimal algorithms in stochastic multi-armed bandits , 2016, ICML.
[4] John Langford,et al. Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits , 2014, ICML.
[5] Haipeng Luo,et al. Improved Regret Bounds for Oracle-Based Adversarial Contextual Bandits , 2016, NIPS.
[6] J. Tropp. FREEDMAN'S INEQUALITY FOR MATRIX MARTINGALES , 2011, 1101.3039.
[7] Ambuj Tewari,et al. From Ads to Interventions: Contextual Bandits in Mobile Health , 2017, Mobile Health - Sensors, Analytic Methods, and Applications.
[8] Aleksandrs Slivkins,et al. Contextual Bandits with Similarity Information , 2009, COLT.
[9] L. Ralaivola,et al. Empirical Bernstein Inequality for Martingales: Application to Online Learning , 2013 .
[10] John Langford,et al. Contextual Bandit Algorithms with Supervised Learning Guarantees , 2010, AISTATS.
[11] R. Oliveira. Concentration of the adjacency matrix and of the Laplacian in random graphs with independent edges , 2009, 0911.0600.
[12] John Langford,et al. Contextual Bandits with Continuous Actions: Smoothing, Zooming, and Adapting , 2019, COLT.
[13] Haipeng Luo,et al. Model selection for contextual bandits , 2019, NeurIPS.
[14] Aleksandrs Slivkins,et al. The Best of Both Worlds: Stochastic and Adversarial Bandits , 2012, COLT.
[15] M. Woodroofe. A One-Armed Bandit Problem with a Concomitant Variable , 1979 .
[16] Zhiwei Steven Wu,et al. The Externalities of Exploration and How Data Diversity Helps Exploitation , 2018, COLT.
[17] Csaba Szepesvári,et al. Improved Algorithms for Linear Stochastic Bandits , 2011, NIPS.
[18] Khashayar Khosravi,et al. Mostly Exploration-Free Algorithms for Contextual Bandits , 2017, Manag. Sci..
[19] Sampath Kannan,et al. A Smoothed Analysis of the Greedy Algorithm for the Linear Contextual Bandit Problem , 2018, NeurIPS.
[20] Martin J. Wainwright,et al. High-Dimensional Statistics , 2019 .
[21] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[22] Akshay Krishnamurthy,et al. Contextual bandits with surrogate losses: Margin bounds and efficient algorithms , 2018, NeurIPS.
[23] Wei Chu,et al. A contextual-bandit approach to personalized news article recommendation , 2010, WWW '10.
[24] Matthew J. Streeter,et al. Tighter Bounds for Multi-Armed Bandits with Expert Advice , 2009, COLT.
[25] John Langford,et al. The Epoch-Greedy Algorithm for Multi-armed Bandits with Side Information , 2007, NIPS.
[26] Wei Chu,et al. Contextual Bandits with Linear Payoff Functions , 2011, AISTATS.
[27] Akshay Krishnamurthy,et al. Efficient Algorithms for Adversarial Contextual Learning , 2016, ICML.
[28] Éva Tardos,et al. Learning in Games: Robustness of Fast Convergence , 2016, NIPS.
[29] Haipeng Luo,et al. Corralling a Band of Bandit Algorithms , 2016, COLT.
[30] Roman Vershynin,et al. High-Dimensional Probability , 2018 .
[31] Jean-Yves Audibert,et al. Regret Bounds and Minimax Policies under Partial Monitoring , 2010, J. Mach. Learn. Res..
[32] John Langford,et al. Making Contextual Decisions with Low Technical Debt , 2016 .