PAC Bandits with Risk Constraints