Online Dynamic Programming