Policy-Gradient Methods for Planning
[1] Mausam, et al. Concurrent Probabilistic Temporal Planning, 2005, ICAPS.
[2] Lex Weaver, et al. A Multi-Agent Policy-Gradient Approach to Network Routing, 2001, ICML.
[3] Lin Zhang, et al. Decision-Theoretic Military Operations Planning, 2004, ICAPS.
[4] Peter L. Bartlett, et al. Experiments with Infinite-Horizon, Policy-Gradient Estimation, 2001, J. Artif. Intell. Res.
[5] Andrew Y. Ng, et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping, 1999, ICML.
[6] Kee-Eung Kim, et al. Learning to Cooperate via Policy Search, 2000, UAI.
[7] Håkan L. S. Younes, et al. Policy Generation for Continuous-time Stochastic Domains with Concurrency, 2004, ICAPS.
[8] Sylvie Thiébaux, et al. Prottle: A Probabilistic Temporal Planner, 2005, AAAI.
[9] Andrew G. Barto, et al. Learning to Act Using Real-Time Dynamic Programming, 1995, Artif. Intell.