Paper information - Planning with predictive state representations

Planning with predictive state representations

Predictive state representation (PSR) models for controlled dynamical systems have recently been proposed as an alternative to traditional models such as partially observable Markov decision processes (POMDPs). In this paper we develop and evaluate two general planning algorithms for PSR models. First, we show how planning algorithms for POMDPs that exploit the piecewise linear property of value functions for finite-horizon problems can be extended to PSRs. This requires an interesting replacement of the role of hidden nominal states in POMDPs with linearly independent predictions in PSRs. Second, we show how traditional reinforcement learning algorithms such as Q-learning can be extended to PSR models. We empirically evaluate both our algorithms on a standard set of test POMDP problems.
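
To make the second idea in the abstract concrete, here is a minimal sketch, not the paper's implementation, of how Q-learning might operate on a PSR state. It assumes a linear PSR whose state is a vector `p` of core-test predictions, updated after each action-observation pair with the standard linear-PSR rule, and it assumes a plain linear Q-function over `p`; the paper may use a different function approximator. All names (`psr_update`, `q_learning_step`, `m_ao`, `M_ao`) and the toy parameters in the usage note are hypothetical.

```python
from typing import Dict, List

Vector = List[float]
Matrix = List[List[float]]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def psr_update(p: Vector, m_ao: Vector, M_ao: Matrix) -> Vector:
    """Condition the prediction vector on taking action a and observing o.

    p[j] is the predicted probability of core test q_j given the history h.
    Assumed model parameters (hypothetical names):
      m_ao          -> p(ao | h)       = p . m_ao
      M_ao column j -> p(a o q_j | h)  = p . M_ao[:, j]
    The new prediction for q_j is the ratio of these two quantities.
    """
    prob_ao = dot(p, m_ao)
    if prob_ao <= 0.0:
        raise ValueError("observation has zero probability under the model")
    k = len(p)
    return [sum(p[r] * M_ao[r][j] for r in range(k)) / prob_ao for j in range(k)]


def q_value(w: Dict[str, Vector], p: Vector, action: str) -> float:
    """Linear Q-function over the prediction vector (an assumption)."""
    return dot(w[action], p)


def q_learning_step(
    w: Dict[str, Vector],
    p: Vector,
    action: str,
    reward: float,
    p_next: Vector,
    alpha: float = 0.1,
    gamma: float = 0.95,
) -> float:
    """One Q-learning update of the weights for `action`; returns the TD error."""
    target = reward + gamma * max(q_value(w, p_next, a) for a in w)
    td_error = target - q_value(w, p, action)
    w[action] = [wi + alpha * td_error * pi for wi, pi in zip(w[action], p)]
    return td_error
```

As a usage illustration, consider a toy system with two core tests in which observing `o0` reveals the underlying state. Calling `psr_update([0.5, 0.5], [1.0, 0.0], [[0.9, 0.1], [0.0, 0.0]])` yields a new prediction vector of approximately `[0.9, 0.1]`. The prediction vector plays the role that the belief state plays in a POMDP, so the same Q-learning loop applies with `p` in place of the belief.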

Michael L. Littman | Michael R. James | Satinder P. Singh
