Policy-gradient for robust planning
[1] Sylvie Thiébaux, et al. Prottle: A Probabilistic Temporal Planner, 2005, AAAI.
[2] A. Cassandra, et al. Exact and Approximate Algorithms for Partially Observable Markov Decision Processes, 1998.
[3] Håkan L. S. Younes, et al. Policy Generation for Continuous-time Stochastic Domains with Concurrency, 2004, ICAPS.
[4] Douglas Aberdeen, et al. Policy-Gradient Methods for Planning, 2005, NIPS.
[5] Peter L. Bartlett, et al. Experiments with Infinite-Horizon, Policy-Gradient Estimation, 2001, J. Artif. Intell. Res.
[6] Lin Zhang, et al. Decision-Theoretic Military Operations Planning, 2004, ICAPS.
[7] Manuela M. Veloso, et al. Multiagent Learning Using a Variable Learning Rate, 2002, Artif. Intell.
[8] Mausam, et al. Concurrent Probabilistic Temporal Planning, 2005, ICAPS.
[9] Olivier Buffet, et al. Robust Planning with (L)RTDP, 2005, IJCAI.
[10] Martin Zinkevich, et al. Online Convex Programming and Generalized Infinitesimal Gradient Ascent, 2003, ICML.
[11] Yishay Mansour, et al. Nash Convergence of Gradient Dynamics in General-Sum Games, 2000, UAI.
[12] Bernhard Nebel, et al. The FF Planning System: Fast Plan Generation Through Heuristic Search, 2001, J. Artif. Intell. Res.
[13] Laurent El Ghaoui, et al. Robustness in Markov Decision Problems with Uncertain Transition Matrices, 2003, NIPS.
[14] Leslie Pack Kaelbling, et al. Playing is Believing: The Role of Beliefs in Multi-agent Learning, 2001, NIPS.
[15] Peter L. Bartlett, et al. Infinite-Horizon Policy-Gradient Estimation, 2001, J. Artif. Intell. Res.