Intrinsically Motivated Decision Making for Situated, Goal-Driven Agents

Goal-driven agents are generally expected to be capable of simultaneously pursuing a variety of goals. As these goals may compete in certain circumstances, the agent must be able to constantly trade them off and shift their priorities in a rational way. One aspect of rationality is to evaluate one's needs and make decisions accordingly. We endow the agent with a set of needs, or drives, that change over time as a function of external stimuli and internal consumption, and the decision-making process has to generate actions that maintain balance between these needs. The proposed framework treats decision making as a multiobjective problem and solves it approximately using a hierarchical reinforcement learning architecture. At the higher level, a Q-learning module learns to select the learning strategy that best improves the well-being of the agent. At the lower level, an actor-critic design executes the selected strategy while interacting with a continuous, partially observable environment. We provide simulation results to demonstrate the efficiency of the approach.
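The two-level scheme described above can be illustrated with a minimal sketch. The drive dynamics, well-being measure, strategy set, and all parameter values below are illustrative assumptions, not the paper's actual model: the high level is tabular Q-learning over one strategy per drive, rewarded by the change in well-being, while the lower-level actor-critic is stubbed by a function that partially satisfies the targeted drive as the other drives decay through internal consumption.

```python
import random

random.seed(0)

SETPOINT = 0.5  # assumed ideal level for each drive


def wellbeing(drives):
    # Well-being: negative total deviation of the drives from their setpoints.
    return -sum(abs(d - SETPOINT) for d in drives)


def most_deficient(drives):
    # Discretized high-level state: index of the lowest (most urgent) drive.
    return min(range(len(drives)), key=lambda i: drives[i])


def execute_strategy(drives, k):
    # Stand-in for the lower-level actor-critic: strategy k partially
    # satisfies drive k while every drive decays via internal consumption.
    new = [max(0.0, d - 0.05) for d in drives]
    new[k] = min(1.0, new[k] + 0.3)
    return new


class HighLevelQ:
    """Tabular Q-learning over learning strategies (one per drive)."""

    def __init__(self, n, alpha=0.1, gamma=0.9, eps=0.2):
        self.q = {(s, a): 0.0 for s in range(n) for a in range(n)}
        self.n, self.alpha, self.gamma, self.eps = n, alpha, gamma, eps

    def select(self, s):
        # Epsilon-greedy strategy selection.
        if random.random() < self.eps:
            return random.randrange(self.n)
        return max(range(self.n), key=lambda a: self.q[(s, a)])

    def update(self, s, a, r, s2):
        best = max(self.q[(s2, a2)] for a2 in range(self.n))
        self.q[(s, a)] += self.alpha * (r + self.gamma * best - self.q[(s, a)])


drives = [0.2, 0.8, 0.4]  # three internal drives
agent = HighLevelQ(n=len(drives))
for _ in range(500):
    s = most_deficient(drives)
    a = agent.select(s)
    new_drives = execute_strategy(drives, a)
    r = wellbeing(new_drives) - wellbeing(drives)  # reward = well-being gain
    agent.update(s, a, r, most_deficient(new_drives))
    drives = new_drives
```

Because the reward is the gain in well-being rather than the satisfaction of any single drive, the high level is pushed toward strategies that keep all drives near their setpoints, which is the balancing behavior the abstract describes.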
