Motivated Reinforcement Learning

The standard reinforcement learning view of the involvement of neuromodulatory systems in instrumental conditioning includes a rather straightforward conception of motivation as the prediction of summed future reward. Competition between actions is based on the motivating characteristics, in this sense, of the states they lead to. Substantial, careful experiments into the neurobiology and psychology of motivation (reviewed in Dickinson & Balleine,12,13) show that this view is incomplete. In many cases, animals are faced not with a choice between many different actions at a given state, but rather with the question of whether a single response is worth executing at all. Evidence suggests that the motivational process underlying this decision has different psychological and neural properties from the one underlying action choice. We describe and model these motivational systems, and consider the way they interact.
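As an illustration of the standard view only (not the model developed in the paper), the following minimal sketch learns state values as predictions of discounted summed future reward with a TD(0) update, then lets two actions compete through a softmax over the values of the states they lead to. The chain task, the function names `td0_values` and `softmax_choice_probs`, and the parameters `gamma`, `alpha`, and `beta` are all assumptions chosen for the example.

```python
import math


def td0_values(rewards, gamma=0.9, alpha=0.1, sweeps=2000):
    """Learn V(s) on a hypothetical deterministic chain s0 -> s1 -> ... -> end.

    rewards[s] is the reward received on leaving state s.
    V(s) comes to predict the discounted sum of future reward from s.
    """
    n = len(rewards)
    values = [0.0] * (n + 1)  # final entry is the terminal state, fixed at 0
    for _ in range(sweeps):
        for s in range(n):
            # Temporal-difference prediction error for the step s -> s + 1
            delta = rewards[s] + gamma * values[s + 1] - values[s]
            values[s] += alpha * delta
    return values


def softmax_choice_probs(successor_values, beta=1.0):
    """Actions compete according to the value of the state each leads to."""
    top = max(successor_values)  # subtract the max for numerical stability
    exps = [math.exp(beta * (v - top)) for v in successor_values]
    total = sum(exps)
    return [e / total for e in exps]


# Reward arrives only on leaving the last state of a three-state chain.
values = td0_values([0.0, 0.0, 1.0])

# Two hypothetical actions: one leads to state 1, the other to state 2.
probs = softmax_choice_probs([values[1], values[2]], beta=3.0)
```

In this sketch the action leading to the state nearer the reward is chosen more often. Nothing in it captures the separate question raised above of whether a single response is worth executing at all.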

[1]  E. Tolman The determiners of behavior at a choice point. , 1938 .

[2]  W. Estes Discriminative conditioning. I. A discriminative property of conditioned anticipation. , 1943 .

[3]  J. Konorski Conditioned reflexes and neuron organization. , 1948 .

[4]  S. Ochs Integrative Activity of the Brain: An Interdisciplinary Approach , 1968 .

[5]  N. Mackintosh The psychology of animal learning , 1974 .

[6]  R. Rescorla,et al.  The effect of two ways of devaluing the unconditioned stimulus after first- and second-order appetitive conditioning. , 1975, Journal of experimental psychology. Animal behavior processes.

[7]  Christopher D. Adams Variations in the Sensitivity of Instrumental Responding to Reinforcer Devaluation , 1982 .

[8]  Richard S. Sutton,et al.  Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.

[9]  K. Berridge,et al.  Palatability Shift of a Salt-Associated Incentive during Sodium Depletion , 1989, The Quarterly journal of experimental psychology. B, Comparative and physiological psychology.

[10]  Peter Dayan,et al.  Improving Generalization for Temporal Difference Learning: The Successor Representation , 1993, Neural Computation.

[11]  B. Balleine,et al.  Motivational control of goal-directed action , 1994 .

[12]  Richard S. Sutton,et al.  TD Models: Modeling the World at a Mixture of Time Scales , 1995, ICML.

[13]  Ben J. A. Kröse,et al.  Learning from delayed rewards , 1995, Robotics Auton. Syst.

[14]  R. Boakes,et al.  Motivational control after extended instrumental training , 1995 .

[15]  C. Balkenius Natural intelligence in artificial creatures , 1995 .

[16]  Peter Dayan,et al.  Bee foraging in uncertain environments using predictive hebbian learning , 1995, Nature.

[17]  B. Balleine,et al.  Motivational control of heterogeneous instrumental chains. , 1995 .

[18]  P. Dayan,et al.  A framework for mesencephalic dopamine systems based on predictive Hebbian learning , 1996, The Journal of neuroscience : the official journal of the Society for Neuroscience.

[19]  Peter Dayan,et al.  A Neural Substrate of Prediction and Reward , 1997, Science.

[20]  E. Spier From Reactive Behaviour to Adaptive Behaviour: Motivational Models for Behaviour in Animals and Robo , 1997 .

[21]  Andrew G. Barto,et al.  Reinforcement learning , 1998 .

[22]  B. Balleine,et al.  Goal-directed instrumental action: contingency and incentive learning and their cortical substrates , 1998, Neuropharmacology.

[23]  Richard S. Sutton,et al.  Introduction to Reinforcement Learning , 1998 .

[24]  P. Holland,et al.  Amygdala circuitry in attentional and representational processes , 1999, Trends in Cognitive Sciences.

[25]  Jonathan D. Cohen,et al.  Cognition and control in schizophrenia: a computational model of dopamine and prefrontal function , 1999, Biological Psychiatry.

[26]  G. Schoenbaum,et al.  Neural Encoding in Orbitofrontal Cortex and Basolateral Amygdala during Olfactory Discrimination Learning , 1999, The Journal of Neuroscience.

[27]  K. Berridge Reward learning: Reinforcement, incentives, and expectations , 2000 .

[28]  B. Balleine,et al.  The Role of the Nucleus Accumbens in Instrumental Conditioning: Evidence of a Functional Dissociation between Accumbens Core and Shell , 2001, The Journal of Neuroscience.