Speeding-up Reinforcement Learning with Multi-step Actions

In recent years, hierarchical concepts of temporal abstraction have been integrated into the reinforcement learning framework to improve scalability. However, existing approaches are limited to domains where a decomposition into subtasks is known a priori. In this paper we propose the concept of explicitly selecting time-scale-related actions when no subgoal-related abstract actions are available. This is realised with multi-step actions on different time scales that are combined in one single action set. The special structure of this action set is exploited in the MSA-Q-learning algorithm. By learning on different explicitly specified time scales simultaneously, a considerable improvement in learning speed can be achieved. This is demonstrated on two benchmark problems.
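To illustrate the idea, the following is a minimal tabular sketch, not the paper's implementation: each primitive action is paired with an explicit duration from a small set of time scales, and a single execution of an n-step action yields Q-updates for every shorter time scale as well, using the discounted prefix rewards. The corridor environment, the duration set, and all parameter values are illustrative assumptions.

```python
import random

DURATIONS = (1, 2, 4)   # explicit time scales combined in one action set (assumed)
GAMMA, ALPHA = 0.95, 0.5
N = 20                  # toy 1-D corridor; goal at the rightmost cell (assumed domain)

def step(s, a):
    """One primitive step: a in {-1, +1}; reward -1 per step until the goal."""
    s2 = max(0, min(N - 1, s + a))
    done = (s2 == N - 1)
    return s2, (0.0 if done else -1.0), done

def greedy_value(Q, s):
    """Best Q-value over all (primitive action, duration) pairs in state s."""
    return max(Q[(s, a, n)] for a in (-1, 1) for n in DURATIONS)

def msa_q_learning(episodes=300, eps=0.1, seed=0):
    rng = random.Random(seed)
    Q = {(s, a, n): 0.0 for s in range(N) for a in (-1, 1) for n in DURATIONS}
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # epsilon-greedy choice over the combined multi-step action set
            if rng.random() < eps:
                a, n = rng.choice((-1, 1)), rng.choice(DURATIONS)
            else:
                a, n = max(((a, n) for a in (-1, 1) for n in DURATIONS),
                           key=lambda an: Q[(s, *an)])
            # execute the multi-step action, recording intermediate outcomes
            traj, s2 = [], s
            for _ in range(n):
                s2, r, done = step(s2, a)
                traj.append((r, s2, done))
                if done:
                    break
            # one execution updates Q on every time scale it covers
            for m in DURATIONS:
                if m > len(traj):
                    break
                R = sum(GAMMA ** k * traj[k][0] for k in range(m))
                s_m, done_m = traj[m - 1][1], traj[m - 1][2]
                target = R if done_m else R + GAMMA ** m * greedy_value(Q, s_m)
                Q[(s, a, m)] += ALPHA * (target - Q[(s, a, m)])
            s = s2
    return Q
```

Reusing each multi-step execution for all shorter durations is what makes the combined action set cheap: the agent gathers updates on several time scales from a single trajectory segment.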