Retrospective model-based inference guides model-free credit assignment
Rani Moran | Mehdi Keramati | Peter Dayan | Raymond J. Dolan
[1] Y. Benjamini, et al. Controlling the false discovery rate: a practical and powerful approach to multiple testing. 1995.
[2] B. Balleine, et al. Human and Rodent Homologies in Action Control: Corticostriatal Determinants of Goal-Directed and Habitual Action. Neuropsychopharmacology, 2010.
[3] P. Dayan, et al. Adaptive integration of habits into depth-limited planning defines a habitual-goal-directed spectrum. Proceedings of the National Academy of Sciences, 2016.
[4] Christopher D. Adams, et al. Instrumental Responding following Reinforcer Devaluation. 1981.
[5] Richard S. Sutton, et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming. ML, 1990.
[6] Shinsuke Shimojo, et al. Neural Computations Underlying Arbitration between Model-Based and Model-free Learning. Neuron, 2013.
[7] Adam Kepecs, et al. Midbrain Dopamine Neurons Signal Belief in Choice Accuracy during a Perceptual Decision. Current Biology, 2017.
[8] Rani Moran, et al. Old processes, new perspectives: Familiarity is correlated with (not independent of) recollection and is more (not equally) variable for targets than for lures. Cognitive Psychology, 2015.
[9] N. Parga, et al. Dopamine reward prediction error signal codes the temporal evaluation of a perceptual decision report. Proceedings of the National Academy of Sciences, 2017.
[10] S. Gershman, et al. Dopamine reward prediction errors reflect hidden state inference across time. Nature Neuroscience, 2017.
[11] Richard S. Sutton, et al. Reinforcement learning with replacing eligibility traces. Machine Learning, 2004.
[12] Keiji Tanaka, et al. Matching Categorical Object Representations in Inferior Temporal Cortex of Man and Monkey. Neuron, 2008.
[13] Stefan Bode, et al. Intrinsic Valuation of Information in Decision Making under Uncertainty. PLoS Comput. Biol., 2016.
[14] B. Balleine, et al. The role of the dorsomedial striatum in instrumental conditioning. The European Journal of Neuroscience, 2005.
[15] R. Rescorla. A theory of Pavlovian conditioning: The effectiveness of reinforcement and non-reinforcement. 1972.
[16] Peter Dayan, et al. Dopamine: generalization and bonuses. Neural Networks, 2002.
[17] S. S. Stevens, et al. Handbook of Experimental Psychology. 1951.
[18] Amir Dezfouli, et al. Speed/Accuracy Trade-Off between the Habitual and the Goal-Directed Processes. PLoS Comput. Biol., 2011.
[19] S. Killcross, et al. Coordination of actions and habits in the medial prefrontal cortex of rats. Cerebral Cortex, 2003.
[20] Holly C. Miller, et al. Preference for 50% reinforcement over 75% reinforcement by pigeons. Learning & Behavior, 2009.
[21] Richard S. Sutton, et al. Reinforcement Learning: An Introduction. IEEE Trans. Neural Networks, 1998.
[22] Ethan S. Bromberg-Martin, et al. Lateral habenula neurons signal errors in the prediction of reward information. Nature Neuroscience, 2011.
[23] B. Balleine, et al. Lesions of dorsolateral striatum preserve outcome expectancy but disrupt habit formation in instrumental learning. The European Journal of Neuroscience, 2004.
[24] Dylan A. Simon, et al. Model-based choices involve prospective neural activity. Nature Neuroscience, 2015.
[25] P. Dayan, et al. Model-based influences on humans' choices and striatal prediction errors. Neuron, 2011.
[26] Peter Dayan, et al. A Neural Substrate of Prediction and Reward. Science, 1997.
[27] Marco Vasconcelos, et al. Irrational choice and the value of information. Scientific Reports, 2015.
[28] Thomas H. B. FitzGerald, et al. Disruption of Dorsolateral Prefrontal Cortex Decreases Model-Based in Favor of Model-free Control in Humans. Neuron, 2013.
[29] Roger Ratcliff, et al. Assessing model mimicry using the parametric bootstrap. 2004.
[30] P. Dayan, et al. States versus Rewards: Dissociable Neural Prediction Error Signals Underlying Model-Based and Model-Free Reinforcement Learning. Neuron, 2010.
[31] S. Gershman, et al. Belief state representation in the dopamine system. Nature Communications, 2018.
[32] Jessica P. Stagner, et al. Maladaptive choice behaviour by pigeons: an animal analogue and possible mechanism for gambling (sub-optimal human decision-making behaviour). Proceedings of the Royal Society B: Biological Sciences, 2011.
[33] F. Cushman, et al. Habitual control of goal selection in humans. Proceedings of the National Academy of Sciences, 2015.
[34] Nikos K. Logothetis, et al. Frontiers in Computational Neuroscience, 2022.
[35] Keiji Tanaka, et al. Object category structure in response patterns of neuronal population in monkey inferior temporal cortex. Journal of Neurophysiology, 2007.
[36] David S. Touretzky, et al. Representation and Timing in Theories of the Dopamine System. Neural Computation, 2006.
[37] Kenji Doya, et al. What are the computations of the cerebellum, the basal ganglia and the cerebral cortex? Neural Networks, 1999.
[38] P. Dayan, et al. Goals and Habits in the Brain. Neuron, 2013.
[39] P. Dayan, et al. Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nature Neuroscience, 2005.
[40] A. Markman, et al. Retrospective Revaluation in Sequential Decision Making: A Tale of Two Systems. Journal of Experimental Psychology: General, 2012.
[42] Zeb Kurth-Nelson, et al. The modulation of savouring by prediction error and its effects on choice. eLife, 2016.
[43] Vivian V. Valentin, et al. Determining the Neural Substrates of Goal-Directed Learning in the Human Brain. The Journal of Neuroscience, 2007.
[44] Rajesh P. N. Rao. Decision Making Under Uncertainty: A Neural Model Based on Partially Observable Markov Decision Processes. Front. Comput. Neurosci., 2010.