Reinforcement learning, conditioning, and the brain: Successes and challenges
