Piecewise-stationary bandit problems with side observations
[1] E. S. Page. Continuous Inspection Schemes, 1954.
[2] A. Shiryaev. On Optimum Methods in Quickest Detection Problems, 1963.
[3] G. Lorden. Procedures for Reacting to a Change in Distribution, 1971.
[4] M. Pollak. Optimal Detection of a Change in Distribution, 1985.
[5] M. Pollak. Average Run Lengths of an Optimal Method of Detecting a Change in Distribution, 1987.
[6] Manfred K. Warmuth, et al. The Weighted Majority Algorithm, 1994, Inf. Comput.
[7] T. Lai. Sequential Analysis: Some Classical Problems and New Challenges, 2001.
[8] Peter Auer, et al. Using Confidence Bounds for Exploitation-Exploration Trade-offs, 2003, J. Mach. Learn. Res.
[9] Peter Auer, et al. The Nonstochastic Multiarmed Bandit Problem, 2002, SIAM J. Comput.
[10] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[11] Mark Herbster, et al. Tracking the Best Expert, 1995, Machine Learning.
[12] C. Fuh. Asymptotic operating characteristics of an optimal change point detection in hidden Markov models, 2004, math/0503682.
[13] Michèle Sebag, et al. Multi-armed Bandit, Dynamic Environments and Meta-Bandits, 2006.
[14] Y. Mei. Sequential change-point detection when unknown parameters are present in the pre-change distribution, 2006, math/0605322.
[15] Gábor Lugosi, et al. Prediction, learning, and games, 2006.
[16] Shie Mannor, et al. Action Elimination and Stopping Conditions for the Multi-Armed Bandit and Reinforcement Learning Problems, 2006, J. Mach. Learn. Res.
[17] Aurélien Garivier, et al. On Upper-Confidence Bound Policies for Non-Stationary Bandit Problems, 2008.
[18] N. Akakpo. Detecting change-points in a discrete distribution via model selection, 2008, 0801.0970.
[19] T. L. Lai and Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules, 1985.