Policy Gradients with Variance Related Risk Criteria
[1] V. Borkar. Stochastic approximation with two time scales, 1997.
[2] Laurent El Ghaoui, et al. Robust Control of Markov Decision Processes with Uncertain Transition Matrices, 2005, Oper. Res.
[3] D. Krass, et al. Percentile performance criteria for limiting average Markov decision processes, 1995, IEEE Trans. Autom. Control.
[4] John N. Tsitsiklis, et al. Mean-Variance Optimization in Markov Decision Processes, 2011, ICML.
[5] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[6] M. J. Sobel. The variance of discounted Markov decision processes, 1982.
[7] Jack L. Treynor, et al. MUTUAL FUND PERFORMANCE*, 2007.
[8] John N. Tsitsiklis, et al. Simulation-based optimization of Markov reward processes, 1998, Proceedings of the 37th IEEE Conference on Decision and Control (Cat. No.98CH36171).
[9] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[10] E. J. Collins, et al. Convergent multiple-timescales reinforcement learning algorithms in normal form games, 2003.
[11] Bingkun Bao, et al. Infinite-Horizon Policy-Gradient Estimation with Variable Discount Factor for Markov Decision Process, 2008, 2008 3rd International Conference on Innovative Computing Information and Control.
[12] Stephen P. Boyd, et al. Convex Optimization, 2004, Algorithms and Theory of Computation Handbook.
[13] R. Howard, et al. Risk-Sensitive Markov Decision Processes, 1972.
[14] J. Cockcroft. Investment in Science, 1962, Nature.
[15] Fritz Wysotzki, et al. Risk-Sensitive Reinforcement Learning Applied to Control under Constraints, 2005, J. Artif. Intell. Res.
[16] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Vol. II, 1976.
[17] Sean P. Meyn, et al. Risk-Sensitive Optimal Control for Markov Decision Processes with Monotone Cost, 2002, Math. Oper. Res.
[18] Shie Mannor, et al. Percentile Optimization for Markov Decision Processes with Parameter Uncertainty, 2010, Oper. Res.
[19] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[20] Jesse Hoey, et al. An analytic solution to discrete Bayesian reinforcement learning, 2006, ICML.
[21] Peter L. Bartlett, et al. Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning, 2001, J. Mach. Learn. Res.