Reinforcement learning, efficient coding, and the statistics of natural tasks

The application of ideas from computational reinforcement learning has recently enabled dramatic advances in behavioral and neuroscientific research. For the most part, these advances have involved insights concerning the algorithms underlying learning and decision making. In the present article, we call attention to the equally important but relatively neglected question of how problems in learning and decision making are internally represented. To articulate the significance of representation for reinforcement learning, we draw on the concept of efficient coding, as developed in perception research. The resulting perspective exposes a range of novel goals for behavioral and neuroscientific research, highlighting in particular the need for research into the statistical structure of naturalistic tasks.
