Soft-Robust Actor-Critic Policy-Gradient
Shie Mannor | Timothy A. Mann | Daniel J. Mankowitz | Esther Derman
[1] Carlos S. Kubrusly,et al. Stochastic approximation algorithms and applications , 1973, CDC 1973.
[2] Harold J. Kushner,et al. Stochastic approximation methods for constrained and unconstrained systems , 1978 .
[3] P. H. Friesen,et al. Innovation in Conservative and Entrepreneurial Firms: Two Models of Strategic Momentum , 1982 .
[4] Richard S. Sutton,et al. Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[5] V. Borkar. Stochastic approximation with two time scales , 1997 .
[6] John N. Tsitsiklis,et al. Actor-Critic Algorithms , 1999, NIPS.
[7] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[8] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[9] Laurent El Ghaoui,et al. Robust Control of Markov Decision Processes with Uncertain Transition Matrices , 2005, Oper. Res..
[10] Garud Iyengar,et al. Robust Dynamic Programming , 2005, Math. Oper. Res..
[11] John N. Tsitsiklis,et al. Bias and Variance Approximation in Value Function Estimates , 2007, Manag. Sci..
[12] Mohammad Ghavamzadeh,et al. Bayesian actor-critic algorithms , 2007, ICML '07.
[13] Shalabh Bhatnagar,et al. Natural actor-critic algorithms , 2009, Autom..
[14] Shie Mannor,et al. Distributionally Robust Markov Decision Processes , 2010, Math. Oper. Res..
[15] Shie Mannor,et al. Lightning Does Not Strike Twice: Robust MDPs with Coupled Uncertainty , 2012, ICML.
[16] Robert Babuska,et al. A Survey of Actor-Critic Reinforcement Learning: Standard and Natural Policy Gradients , 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[17] Peter Vrancx,et al. Reinforcement Learning: State-of-the-Art , 2012 .
[18] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[19] O. Mitchell,et al. The market for retirement financial advice , 2013 .
[20] Shie Mannor,et al. Scaling Up Robust MDPs using Function Approximation , 2014, ICML.
[21] Guy Lever,et al. Deterministic Policy Gradient Algorithms , 2014, ICML.
[22] Shie Mannor,et al. Bayesian Reinforcement Learning: A Survey , 2015, Found. Trends Mach. Learn..
[23] Shie Mannor,et al. Policy Gradient for Coherent Risk Measures , 2015, NIPS.
[24] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[25] Michal Valko,et al. Bayesian Policy Gradient and Actor-Critic Algorithms , 2016, J. Mach. Learn. Res..
[26] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[27] Shie Mannor,et al. Robust MDPs with k-Rectangular Uncertainty , 2016, Math. Oper. Res..
[28] Shie Mannor,et al. Reinforcement Learning in Robust Markov Decision Processes , 2013, Math. Oper. Res..
[29] Huan Xu,et al. Distributionally Robust Counterpart in Markov Decision Processes , 2015, IEEE Transactions on Automatic Control.
[30] Shie Mannor,et al. Deep Robust Kalman Filter , 2017, ArXiv.
[31] Aurko Roy,et al. Reinforcement Learning under Model Mismatch , 2017, NIPS.
[32] Shie Mannor,et al. Learning Robust Options , 2018, AAAI.