Speeding-up Reinforcement Learning with Multi-step Actions

In recent years, hierarchical concepts of temporal abstraction have been integrated into the reinforcement learning framework to improve scalability. However, existing approaches are limited to domains where a decomposition into subtasks is known a priori. In this paper we propose the concept of explicitly selecting time-scale-related actions when no subgoal-related abstract actions are available. This is realised with multi-step actions on different time scales that are combined in one single action set. The special structure of this action set is exploited in the MSA-Q-learning algorithm. By learning on different explicitly specified time scales simultaneously, a considerable improvement in learning speed can be achieved. This is demonstrated on two benchmark problems.
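To illustrate the idea, the following is a minimal tabular sketch, not the paper's implementation: each primitive action is paired with an explicit duration from a small set of time scales, and a single execution of an n-step action yields Q-updates for every shorter time scale as well, using the discounted prefix rewards. The corridor environment, the duration set, and all parameter values are illustrative assumptions.

```python
import random

DURATIONS = (1, 2, 4)   # explicit time scales combined in one action set (assumed)
GAMMA, ALPHA = 0.95, 0.5
N = 20                  # toy 1-D corridor; goal at the rightmost cell (assumed domain)

def step(s, a):
    """One primitive step: a in {-1, +1}; reward -1 per step until the goal."""
    s2 = max(0, min(N - 1, s + a))
    done = (s2 == N - 1)
    return s2, (0.0 if done else -1.0), done

def greedy_value(Q, s):
    """Best Q-value over all (primitive action, duration) pairs in state s."""
    return max(Q[(s, a, n)] for a in (-1, 1) for n in DURATIONS)

def msa_q_learning(episodes=300, eps=0.1, seed=0):
    rng = random.Random(seed)
    Q = {(s, a, n): 0.0 for s in range(N) for a in (-1, 1) for n in DURATIONS}
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # epsilon-greedy choice over the combined multi-step action set
            if rng.random() < eps:
                a, n = rng.choice((-1, 1)), rng.choice(DURATIONS)
            else:
                a, n = max(((a, n) for a in (-1, 1) for n in DURATIONS),
                           key=lambda an: Q[(s, *an)])
            # execute the multi-step action, recording intermediate outcomes
            traj, s2 = [], s
            for _ in range(n):
                s2, r, done = step(s2, a)
                traj.append((r, s2, done))
                if done:
                    break
            # one execution updates Q on every time scale it covers
            for m in DURATIONS:
                if m > len(traj):
                    break
                R = sum(GAMMA ** k * traj[k][0] for k in range(m))
                s_m, done_m = traj[m - 1][1], traj[m - 1][2]
                target = R if done_m else R + GAMMA ** m * greedy_value(Q, s_m)
                Q[(s, a, m)] += ALPHA * (target - Q[(s, a, m)])
            s = s2
    return Q
```

Reusing each multi-step execution for all shorter durations is what makes the combined action set cheap: the agent gathers updates on several time scales from a single trajectory segment.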