Disentangled behavioral representations

Individual characteristics in human decision-making are often quantified by fitting a parametric cognitive model to each subject's behavior and then studying how subjects differ in the model's parameter space. However, such models often fit behavior more poorly than recurrent neural networks (RNNs), which are more flexible and make fewer assumptions about the underlying decision-making processes. Unfortunately, the parameter and latent activity spaces of RNNs are generally high-dimensional and uninterpretable, making it hard to use them to study individual differences. Here, we show how to benefit from the flexibility of RNNs while representing individual differences in a low-dimensional, interpretable space. To achieve this, we propose a novel end-to-end learning framework in which an encoder maps the behavior of each subject into a low-dimensional latent space. This low-dimensional representation is then used to generate the parameters of an individual RNN corresponding to that subject's decision-making process. We introduce loss terms that encourage the latent dimensions to be informative and disentangled, i.e., to have distinct effects on behavior, so that they align with separate facets of individual differences. We illustrate the performance of our framework on synthetic data as well as on a dataset comprising the behavior of patients with psychiatric disorders.
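The pipeline described above (behavior → encoder → low-dimensional latent → generated per-subject RNN, trained end-to-end with an added disentanglement penalty) can be made concrete in a few lines of code. The following is a minimal PyTorch sketch, not the paper's implementation: the network sizes, the vanilla-RNN decoder, and in particular the covariance-based penalty are illustrative assumptions standing in for the paper's actual loss terms.

```python
# Illustrative sketch only: an encoder maps a subject's choice/reward
# sequence to a 2-D latent z; a hypernetwork turns z into the weights of a
# small per-subject RNN that predicts the subject's next action. All sizes
# and the covariance penalty below are assumptions, not the paper's design.
import torch
import torch.nn as nn

class BehaviorEncoder(nn.Module):
    def __init__(self, input_dim=3, hidden_dim=64, latent_dim=2):
        super().__init__()
        self.rnn = nn.GRU(input_dim, hidden_dim, batch_first=True)
        self.to_z = nn.Linear(hidden_dim, latent_dim)

    def forward(self, trials):                 # trials: (batch, T, input_dim)
        _, h = self.rnn(trials)                # h: (1, batch, hidden_dim)
        return self.to_z(h.squeeze(0))         # z: (batch, latent_dim)

class SubjectRNN(nn.Module):
    """Hypernetwork: generates W_h, W_x, b of a vanilla RNN from z."""
    def __init__(self, latent_dim=2, state_dim=8, input_dim=3, n_actions=2):
        super().__init__()
        self.s, self.i = state_dim, input_dim
        self.hyper = nn.Linear(latent_dim, state_dim * (state_dim + input_dim + 1))
        self.readout = nn.Linear(state_dim, n_actions)

    def forward(self, z, trials):
        B, T, _ = trials.shape
        p = self.hyper(z)                      # per-subject RNN weights from z
        W_h = p[:, : self.s * self.s].view(B, self.s, self.s)
        W_x = p[:, self.s * self.s : self.s * (self.s + self.i)].view(B, self.s, self.i)
        b = p[:, -self.s:]
        h, logits = trials.new_zeros(B, self.s), []
        for t in range(T):                     # roll out the decision RNN trial by trial
            h = torch.tanh(torch.bmm(W_h, h.unsqueeze(-1)).squeeze(-1)
                           + torch.bmm(W_x, trials[:, t].unsqueeze(-1)).squeeze(-1) + b)
            logits.append(self.readout(h))
        return torch.stack(logits, dim=1)      # (B, T, n_actions)

# One end-to-end training step on fake data.
enc, dec = BehaviorEncoder(), SubjectRNN()
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
trials = torch.randn(16, 50, 3)                # (subjects, trials, features)
actions = torch.randint(0, 2, (16, 50))        # observed binary choices
z = enc(trials)
logits = dec(z, trials)
pred = nn.functional.cross_entropy(logits.reshape(-1, 2), actions.reshape(-1))
zc = z - z.mean(0)                             # placeholder disentanglement term:
cov = zc.T @ zc / len(zc)                      # decorrelate latent dims across subjects
disent = (cov - torch.diag(torch.diag(cov))).pow(2).sum()
opt.zero_grad()
(pred + 0.1 * disent).backward()
opt.step()
```

Note that the paper's disentanglement terms act on each latent dimension's effect on behavior rather than on the latents' correlation, so the covariance penalty here should be read only as a simple stand-in for "distinct effects per dimension."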
