Spectral Thompson Sampling
Rémi Munos | Michal Valko | Shipra Agrawal | Tomás Kocák
[1] Rémi Munos,et al. Spectral Bandits for Smooth Graph Functions , 2014, ICML.
[2] Shipra Agrawal,et al. Thompson Sampling for Contextual Bandits with Linear Payoffs , 2012, ICML.
[3] Alfred Kobsa,et al. The Adaptive Web, Methods and Strategies of Web Personalization , 2007, The Adaptive Web.
[4] Lihong Li,et al. An Empirical Evaluation of Thompson Sampling , 2011, NIPS.
[5] Rémi Munos,et al. Thompson Sampling for 1-Dimensional Exponential Family Bandits , 2013, NIPS.
[6] Shipra Agrawal,et al. Analysis of Thompson Sampling for the Multi-armed Bandit Problem , 2011, COLT.
[7] Rémi Munos,et al. Thompson Sampling: An Asymptotically Optimal Finite-Time Analysis , 2012, ALT.
[8] Gary L. Miller,et al. Approaching Optimality for Solving SDD Linear Systems , 2010, 2010 IEEE 51st Annual Symposium on Foundations of Computer Science.
[9] David S. Leslie,et al. Optimistic Bayesian Sampling in Contextual-Bandit Problems , 2012, J. Mach. Learn. Res.
[10] Benjamin Van Roy,et al. Eluder Dimension and the Sample Complexity of Optimistic Exploration , 2013, NIPS.
[11] Thomas P. Hayes,et al. The Price of Bandit Information for Online Optimization , 2007, NIPS.
[12] Peter Auer,et al. Using Confidence Bounds for Exploitation-Exploration Trade-offs , 2003, J. Mach. Learn. Res.
[13] Yisong Yue,et al. Hierarchical Exploration for Accelerating Contextual Bandits , 2012, ICML.
[14] Andreas Krause,et al. Information-Theoretic Regret Bounds for Gaussian Process Optimization in the Bandit Setting , 2009, IEEE Transactions on Information Theory.
[15] Wei Chu,et al. A contextual-bandit approach to personalized news article recommendation , 2010, WWW '10.
[16] Albert-László Barabási,et al. Emergence of scaling in random networks , 1999, Science.
[17] Benjamin Van Roy,et al. Learning to Optimize via Posterior Sampling , 2013, Math. Oper. Res..
[18] Csaba Szepesvári,et al. Improved Algorithms for Linear Stochastic Bandits , 2011, NIPS.
[19] Shipra Agrawal,et al. Further Optimal Regret Bounds for Thompson Sampling , 2012, AISTATS.
[20] Michael J. Pazzani,et al. Content-Based Recommendation Systems , 2007, The Adaptive Web.
[21] Mikhail Belkin,et al. Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples , 2006, J. Mach. Learn. Res.
[22] W. R. Thompson. On the Likelihood that One Unknown Probability Exceeds Another in View of the Evidence of Two Samples , 1933, Biometrika.
[23] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.