Linear Programming for Large-Scale Markov Decision Problems
Peter L. Bartlett | Yasin Abbasi-Yadkori | Alan Malek
[1] A. S. Manne. Linear Programming and Sequential Decisions, 1960.
[2] Ronald A. Howard, et al. Dynamic Programming and Markov Processes, 1960.
[3] Vladimir Vapnik, A. Chervonenkis. On the uniform convergence of relative frequencies of events to their probabilities, 1971.
[4] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Vol. II, 1976.
[5] P. Schweitzer, et al. Generalized polynomial approximations in Markovian decision processes, 1985.
[6] M. F., et al. Bibliography, 1985, Experimental Gerontology.
[7] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[8] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[9] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[10] T. Lai, et al. Self-Normalized Processes: Limit Theory and Statistical Applications, 2001.
[11] Benjamin Van Roy, et al. Approximate Linear Programming for Average-Cost Dynamic Programming, 2002, NIPS.
[12] Benjamin Van Roy, et al. The Linear Programming Approach to Approximate Dynamic Programming, 2003, Oper. Res.
[13] Martin Zinkevich, et al. Online Convex Programming and Generalized Infinitesimal Gradient Ascent, 2003, ICML.
[14] Milos Hauskrecht, et al. Linear Program Approximations for Factored Continuous-State Markov Decision Processes, 2003, NIPS.
[15] Milos Hauskrecht, et al. Solving Factored MDPs with Continuous and Discrete Variables, 2004, UAI.
[16] Benjamin Van Roy, et al. On Constraint Sampling in the Linear Programming Approach to Approximate Dynamic Programming, 2004, Math. Oper. Res.
[17] Sean R. Eddy, et al. What is dynamic programming?, 2004, Nature Biotechnology.
[18] Adam Tauman Kalai, et al. Online convex optimization in the bandit setting: gradient descent without a gradient, 2004, SODA '05.
[19] Giuseppe Carlo Calafiore, et al. Uncertain convex programs: randomized solutions and confidence levels, 2005, Math. Program.
[20] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[21] Benjamin Van Roy, et al. A Cost-Shaping Linear Program for Average-Cost Approximate Dynamic Programming with Performance Guarantees, 2006, Math. Oper. Res.
[22] R. Sutton, et al. A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation, 2008, NIPS 2008.
[23] Marco C. Campi, et al. The Exact Feasibility of Randomized Solutions of Uncertain Convex Programs, 2008, SIAM J. Optim.
[24] Michael Bowling, et al. Dual Representations for Dynamic Programming, 2008.
[25] Marek Petrik, et al. Constraint relaxation in approximate linear programs, 2009, ICML '09.
[26] Shalabh Bhatnagar, et al. Fast gradient-descent methods for temporal-difference learning with linear function approximation, 2009, ICML '09.
[27] Shalabh Bhatnagar, et al. Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation, 2009, NIPS.
[28] Shalabh Bhatnagar, et al. Toward Off-Policy Learning Control with Function Approximation, 2010, ICML.
[29] Csaba Szepesvari, et al. Online learning for linearly parametrized control problems, 2012.
[30] Vivek F. Farias, et al. Approximate Dynamic Programming via a Smoothed Linear Program, 2009, Oper. Res.
[31] Michael H. Veatch, et al. Approximate Linear Programming for Average Cost MDPs, 2013, Math. Oper. Res.