Online Learning in Adversarial Lipschitz Environments
[1] Adam Tauman Kalai,et al. Online convex optimization in the bandit setting: gradient descent without a gradient , 2004, SODA '05.
[2] Shai Shalev-Shwartz,et al. Online learning: theory, algorithms and applications , 2007 .
[3] Elad Hazan,et al. Competing in the Dark: An Efficient Algorithm for Bandit Linear Optimization , 2008, COLT.
[4] Peter L. Bartlett,et al. Adaptive Online Gradient Descent , 2007, NIPS.
[5] Martin Zinkevich,et al. Online Convex Programming and Generalized Infinitesimal Gradient Ascent , 2003, ICML.
[6] Thomas P. Hayes,et al. The Price of Bandit Information for Online Optimization , 2007, NIPS.
[7] Peter Auer,et al. The Nonstochastic Multiarmed Bandit Problem , 2002, SIAM J. Comput.
[8] P. Bartlett,et al. Optimal strategies and minimax lower bounds for online convex games [Technical Report No. UCB/EECS-2008-19] , 2008 .
[9] Nando de Freitas,et al. An Introduction to MCMC for Machine Learning , 2004, Machine Learning.
[10] Elad Hazan,et al. Logarithmic regret algorithms for online convex optimization , 2006, Machine Learning.
[11] Peter Green,et al. Markov chain Monte Carlo in Practice , 1996 .
[12] 中澤 真,et al. Devroye, L., Gyorfi, L. and Lugosi, G. : A Probabilistic Theory of Pattern Recognition, Springer (1996). , 1997 .
[13] Manfred K. Warmuth,et al. The weighted majority algorithm , 1989, 30th Annual Symposium on Foundations of Computer Science.
[14] David Haussler,et al. How to use expert advice , 1993, STOC.
[15] Peter Auer,et al. Using Confidence Bounds for Exploitation-Exploration Trade-offs , 2003, J. Mach. Learn. Res.
[16] Ambuj Tewari,et al. Efficient bandit algorithms for online multiclass prediction , 2008, ICML '08.
[17] Csaba Szepesvári,et al. Online Optimization in X-Armed Bandits , 2008, NIPS.
[18] Jan Poland,et al. Nonstochastic bandits: Countable decision set, unbounded costs and reactive environments , 2008, Theor. Comput. Sci.
[19] P. Moral. Feynman-Kac Formulae: Genealogical and Interacting Particle Systems with Applications , 2004 .
[20] R. Douc,et al. Minimum variance importance sampling via Population Monte Carlo , 2007 .
[21] Gábor Lugosi,et al. Prediction, learning, and games , 2006 .
[22] Elizabeth L. Wilmer,et al. Markov Chains and Mixing Times , 2008 .
[23] Claudio Gentile,et al. Adaptive and Self-Confident On-Line Learning Algorithms , 2000, J. Comput. Syst. Sci.
[24] Christopher M. Bishop,et al. Pattern Recognition and Machine Learning (Information Science and Statistics) , 2006 .
[25] Eli Upfal,et al. Multi-Armed Bandits in Metric Spaces , 2008 .
[26] Nicolò Cesa-Bianchi,et al. Combinatorial Bandits , 2012, COLT.
[27] Nicolò Cesa-Bianchi,et al. Gambling in a rigged casino: The adversarial multi-armed bandit problem , 1995, Proceedings of IEEE 36th Annual Foundations of Computer Science.
[28] László Györfi,et al. A Probabilistic Theory of Pattern Recognition , 1996, Stochastic Modelling and Applied Probability.
[29] Gilles Stoltz. Incomplete information and internal regret in prediction of individual sequences , 2005 .
[30] Thomas P. Hayes,et al. Stochastic Linear Optimization under Bandit Feedback , 2008, COLT.