A General Framework for Bandit Problems Beyond Cumulative Objectives.
Authors: Shie Mannor | Asaf Cassel | Assaf Zeevi
Affiliations: School of Computer Science, Tel Aviv University | Faculty of Electrical Engineering, Technion, Israel Institute of Technology | Graduate School of Business, Columbia University
[1] Warren B. Powell, et al. Risk-Averse Approximate Dynamic Programming with Quantile-Based Risk Measures, 2015, Math. Oper. Res.
[2] Alessandro Lazaric, et al. Risk-Aversion in Multi-armed Bandits, 2012, NIPS.
[3] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[4] Qing Zhao, et al. Mean-variance and value at risk in multi-armed bandit problems, 2015, 2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton).
[5] Qing Zhao, et al. Risk-Averse Multi-Armed Bandit Problems Under Mean-Variance Measure, 2016, IEEE Journal of Selected Topics in Signal Processing.
[6] Shie Mannor, et al. PAC Bandits with Risk Constraints, 2018, ISAIM.
[7] Shipra Agrawal, et al. Analysis of Thompson Sampling for the Multi-armed Bandit Problem, 2011, COLT.
[8] Sanjay P. Bhat, et al. Concentration of risk measures: A Wasserstein distance approach, 2019, NeurIPS.
[9] Rémi Munos, et al. A Finite-Time Analysis of Multi-armed Bandits Problems with Kullback-Leibler Divergences, 2011, COLT.
[10] Evan Fisher. On the Law of the Iterated Logarithm for Martingales, 1992.
[11] Nicolò Cesa-Bianchi, et al. Gambling in a rigged casino: The adversarial multi-armed bandit problem, 1995, Proceedings of IEEE 36th Annual Foundations of Computer Science.
[12] Shie Mannor, et al. A General Approach to Multi-Armed Bandits Under Risk Criteria, 2018, COLT.
[13] H. Robbins, et al. Asymptotically efficient adaptive allocation rules, 1985.
[14] P. Massart. The Tight Constant in the Dvoretzky-Kiefer-Wolfowitz Inequality, 1990.
[15] Krishna P. Jagannathan, et al. Concentration bounds for CVaR estimation: The cases of light-tailed and heavy-tailed distributions, 2019, ICML.
[16] Krishnendu Chatterjee, et al. Generalized Risk-Aversion in Stochastic Multi-Armed Bandits, 2014, ArXiv.
[17] Sébastien Bubeck, et al. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems, 2012, Found. Trends Mach. Learn.
[18] Odalric-Ambrym Maillard, et al. Robust Risk-Averse Stochastic Multi-armed Bandits, 2013, ALT.
[19] Michèle Sebag, et al. Exploration vs Exploitation vs Safety: Risk-Aware Multi-Armed Bandits, 2013, ACML.
[20] Krishna Jagannathan, et al. Distribution oblivious, risk-aware algorithms for multi-armed bandits with unbounded rewards, 2019, NeurIPS.
[21] Philippe Artzner, et al. Coherent Measures of Risk, 1999.
[22] Nikhil R. Devanur, et al. Bandits with Global Convex Constraints and Objective, 2019, Oper. Res.
[23] Michael R. Lyu, et al. Risk Control of Best Arm Identification in Multi-armed Bandits via Successive Rejects, 2017, 2017 IEEE International Conference on Data Mining (ICDM).
[24] H. Robbins. Some aspects of the sequential design of experiments, 1952.