Choice modulates the neural dynamics of prediction error processing during rewarded learning

Our ability to selectively engage with our environment enables us to guide our learning and to take advantage of its benefits. When facing multiple possible actions, our choices are a critical aspect of learning. In the case of learning from rewarding feedback, there has been substantial theoretical and empirical progress in elucidating the associated behavioral and neural processes, predominantly in terms of a reward prediction error, a measure of the discrepancy between actual and expected reward. Nevertheless, the distinct influence of choice on prediction error processing and its neural dynamics remains relatively unexplored. In this study, we used a novel paradigm to determine how choice influences prediction error processing and to examine whether there are correspondingly distinct neural dynamics. We recorded the scalp electroencephalogram while healthy adults performed a rewarded learning task in which choice trials were intermingled with control trials involving the same stimuli, motor responses, and probabilistic rewards. We used a temporal difference learning model of subjects' trial-by-trial choices to infer their image valuations and the corresponding prediction errors. As expected, choices were associated with lower overall prediction error magnitudes, most notably over the course of learning the stimulus-reward contingencies. Choices also induced a higher-amplitude relative positivity in the frontocentral event-related potential about 200 ms after reward signal onset that was negatively correlated with the differential effect of choice on the prediction error. Thus, choice influences the neural dynamics associated with how reward signals are processed during learning. Behavioral, computational, and neurobiological models of rewarded learning should therefore accommodate a distinct influence of choice.
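To make the modeling approach concrete, the following is a minimal sketch of the kind of temporal difference learning model described above, in which a stimulus value is updated by a reward prediction error (actual minus expected reward) and choices are generated by a softmax over learned values. The learning rate `alpha`, inverse temperature `beta`, and the 70% reward probability are illustrative assumptions, not parameters reported in the study.

```python
import numpy as np

def td_update(value, reward, alpha=0.1):
    """One TD(0)/Rescorla-Wagner step for a single stimulus value.

    delta is the reward prediction error: actual minus expected reward.
    alpha is an assumed learning rate, not a fitted parameter from the study.
    """
    delta = reward - value
    return value + alpha * delta, delta

def softmax_choice_prob(values, beta=3.0):
    """Probability of choosing each option given its learned values.

    beta is an assumed inverse-temperature parameter.
    """
    z = beta * np.asarray(values, dtype=float)
    z -= z.max()  # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum()

# Illustration: learning the value of a stimulus rewarded on 70% of trials.
rng = np.random.default_rng(0)
V = 0.0
for _ in range(200):
    reward = float(rng.random() < 0.7)  # probabilistic binary reward
    V, delta = td_update(V, reward)
# V converges toward the true reward probability, and |delta| shrinks
# on average as the stimulus-reward contingency is learned.
```

In a model fit to behavior, `alpha` and `beta` would typically be estimated from each subject's trial-by-trial choices, and the resulting per-trial `delta` values would serve as the prediction error regressors analyzed against the EEG.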
