Feature-based methods for large scale dynamic programming

We develop a methodological framework and present a few different ways in which dynamic programming and compact representations can be combined to solve large scale stochastic control problems. In particular, we develop algorithms that employ two types of feature-based compact representations; that is, representations that involve feature extraction and a relatively simple approximation architecture. We prove the convergence of these algorithms and provide bounds on the approximation error. As an example, one of these algorithms is used to generate a strategy for the game of Tetris. Furthermore, we provide a counter-example illustrating the difficulties of integrating compact representations with dynamic programming, which exemplifies the shortcomings of certain simple approaches.
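To make the idea of a feature-based compact representation concrete, below is a minimal sketch of approximate value iteration with a linear feature architecture on a toy Markov decision process. It is not the paper's exact algorithms: the toy transition kernel, the polynomial features of a scalar state index, the step costs, the discount factor, and the least-squares fitting step are all illustrative assumptions chosen only to show how feature extraction and a simple approximation architecture plug into a dynamic programming backup.

```python
# Illustrative sketch only: generic fitted value iteration with a linear,
# feature-based architecture on a randomly generated toy MDP. All numbers,
# features, and the fitting rule are assumptions, not the paper's algorithms.
import numpy as np

n_states, n_actions, gamma = 6, 2, 0.9
rng = np.random.default_rng(0)

# Random transition kernel P[a, s, s'] and stage cost g[s, a] for the toy MDP.
P = rng.dirichlet(np.ones(n_states), size=(n_actions, n_states))
g = rng.uniform(0.0, 1.0, size=(n_states, n_actions))

# Feature extraction: map each state to a low-dimensional feature vector.
def features(s: int) -> np.ndarray:
    x = s / (n_states - 1)
    return np.array([1.0, x, x ** 2])

Phi = np.stack([features(s) for s in range(n_states)])  # n_states x n_features
w = np.zeros(Phi.shape[1])                              # weights of the architecture

for _ in range(200):
    V = Phi @ w                                          # compact cost-to-go estimate
    # Bellman backup at every state using the approximate values.
    Q = g + gamma * np.einsum('ast,t->sa', P, V)
    targets = Q.min(axis=1)
    # Fit the architecture to the backed-up values (least-squares projection).
    w, *_ = np.linalg.lstsq(Phi, targets, rcond=None)

print("approximate cost-to-go:", np.round(Phi @ w, 3))
```

The sketch illustrates the general recipe the abstract refers to: the value function over a potentially large state space is represented compactly through features, and each dynamic programming iteration is projected back onto that architecture rather than stored state by state.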
