Models that learn how humans learn: The case of decision-making and its disorders

Popular computational models of decision-making make specific assumptions about learning processes that may cause them to underfit observed behaviours. Here we suggest an alternative method using recurrent neural networks (RNNs) to generate a flexible family of models that have sufficient capacity to represent the complex learning and decision-making strategies used by humans. In this approach, an RNN is trained to predict the next action that a subject will take in a decision-making task and, in this way, learns to imitate the processes underlying subjects’ choices and their learning abilities. We demonstrate the benefits of this approach using a new dataset drawn from patients with unipolar depression (n = 34), patients with bipolar depression (n = 33) and matched healthy controls (n = 34) making decisions on a two-armed bandit task. The results indicate that the new approach outperforms baseline reinforcement-learning methods in overall performance and in its capacity to predict subjects’ choices. We show that the model can be interpreted using off-policy simulations and thereby provides a novel clustering of subjects’ learning processes, something that often eludes traditional approaches to modelling and behavioural analysis.
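
To make the training setup concrete, the sketch below shows one way a next-action predictor of this kind could be set up. It is a minimal illustration under our own assumptions, not the authors’ implementation: the choice of PyTorch, the GRU architecture, the hidden size and the random stand-in action/reward tensors are illustrative, and the only detail taken from the abstract is that the network receives a subject’s past choices and outcomes and is trained to predict the choice made on the next trial of a two-armed bandit task.

import torch
import torch.nn as nn

class ChoicePredictor(nn.Module):
    """GRU that maps a subject's past (action, reward) pairs to a
    distribution over the next choice on a two-armed bandit task."""

    def __init__(self, n_actions=2, hidden_size=16):
        super().__init__()
        self.n_actions = n_actions
        # Input on each trial: one-hot previous action plus previous reward.
        self.rnn = nn.GRU(input_size=n_actions + 1, hidden_size=hidden_size,
                          batch_first=True)
        self.readout = nn.Linear(hidden_size, n_actions)

    def forward(self, prev_actions, prev_rewards):
        # prev_actions: (batch, trials) int64; prev_rewards: (batch, trials) float
        x = torch.cat([nn.functional.one_hot(prev_actions, self.n_actions).float(),
                       prev_rewards.unsqueeze(-1)], dim=-1)
        h, _ = self.rnn(x)
        return self.readout(h)  # logits for the next action at every trial

model = ChoicePredictor()
optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# actions[t] is the choice made on trial t and rewards[t] the outcome received;
# random stand-ins are used here in place of real behavioural sequences.
actions = torch.randint(0, 2, (1, 100))
rewards = torch.rand(1, 100)

for epoch in range(200):
    # Inputs are trials 0..T-2 and targets are the choices on trials 1..T-1,
    # so the network always predicts one step ahead.
    logits = model(actions[:, :-1], rewards[:, :-1])
    loss = loss_fn(logits.reshape(-1, 2), actions[:, 1:].reshape(-1))
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()

Once trained, such a network can also be driven off-policy by feeding it hypothetical action and reward histories, which is the style of simulation the abstract refers to when interpreting the model and clustering subjects’ learning processes.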
