Policy-Gradients for PSRs and POMDPs
[1] Reid G. Simmons, et al. Probabilistic Robot Navigation in Partially Observable Environments, 1995, IJCAI.
[2] Richard S. Sutton, et al. Using Predictive Representations to Improve Generalization in Reinforcement Learning, 2005, IJCAI.
[3] Peter L. Bartlett, et al. Experiments with Infinite-Horizon, Policy-Gradient Estimation, 2001, J. Artif. Intell. Res.
[4] Richard S. Sutton, et al. Predictive Representations of State, 2001, NIPS.
[5] Michael R. James, et al. Predictive State Representations: A New Theory for Modeling Dynamical Systems, 2004, UAI.
[6] Michael L. Littman, et al. Planning with predictive state representations, 2004, Proceedings of the 2004 International Conference on Machine Learning and Applications.
[7] Michael H. Bowling, et al. Online Discovery and Learning of Predictive State Representations, 2005, NIPS.
[8] Lonnie Chrisman, et al. Reinforcement Learning with Perceptual Aliasing: The Perceptual Distinctions Approach, 1992, AAAI.
[9] Michael H. Bowling, et al. Learning predictive state representations using non-blind policies, 2006, ICML.
[10] Andrew McCallum, et al. Reinforcement learning with selective perception and hidden state, 1996.
[11] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[12] Stefan Schaal, et al. Natural Actor-Critic, 2003, Neurocomputing.