Behavior hierarchy learning in a behavior-based system using reinforcement learning

Hand-designing an intelligent agent's behaviors and their hierarchy is a very hard task. An important step toward creating intelligent agents is therefore giving them the capability to learn both the required behaviors and their architecture. This paper considers architecture learning in a behavior-based agent with a subsumption architecture. To learn the behavior hierarchy, the overall value function is decomposed into easily computable parts. Using probabilistic formulations, two decomposition methods are discussed: storing the estimated value of each behavior in each layer, and storing the ordering of behaviors in the architecture. For each decomposition, an appropriate credit assignment method is designed. Finally, the proposed methods are tested on a multi-robot object-lifting task, where they yield satisfactory performance.
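The first decomposition can be illustrated with a minimal sketch: a table of estimated values Q[b][l] for placing behavior b in layer l, updated from a shared episode reward. The behavior names, the reward model, and all parameters below are illustrative assumptions for the sketch, not the paper's actual formulation.

```python
import random

# Sketch: learn an estimated value Q[b][l] for assigning behavior b
# to layer l of a subsumption hierarchy (assumed setup, not the paper's).

BEHAVIORS = ["avoid", "lift", "wander"]
N_LAYERS = 3
ALPHA, EPSILON, EPISODES = 0.1, 0.2, 2000

# Hypothetical "true" best hierarchy, used only to generate reward here.
TARGET = {"avoid": 0, "lift": 1, "wander": 2}

Q = {b: [0.0] * N_LAYERS for b in BEHAVIORS}

def propose_hierarchy():
    """Epsilon-greedy: pick a layer for each behavior from its Q-values."""
    layers = {}
    for b in BEHAVIORS:
        if random.random() < EPSILON:
            layers[b] = random.randrange(N_LAYERS)
        else:
            layers[b] = max(range(N_LAYERS), key=lambda l: Q[b][l])
    return layers

def reward(layers):
    """Global reward: fraction of behaviors placed in their target layer."""
    return sum(layers[b] == TARGET[b] for b in BEHAVIORS) / len(BEHAVIORS)

random.seed(0)
for _ in range(EPISODES):
    layers = propose_hierarchy()
    r = reward(layers)
    # Simple credit assignment: every (behavior, layer) choice made this
    # episode shares the global reward equally.
    for b, l in layers.items():
        Q[b][l] += ALPHA * (r - Q[b][l])

learned = {b: max(range(N_LAYERS), key=lambda l: Q[b][l]) for b in BEHAVIORS}
print(learned)
```

With enough episodes, the greedy assignment recovers the layer ordering that earns the highest shared reward; the paper's second decomposition would instead store values over orderings of behaviors directly.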
