Prospective Optimization

Human performance approaches that of an ideal observer and optimal actor in some perceptual and motor tasks. These near-optimal abilities depend on the capacity of the cerebral cortex to store an immense amount of information and to make rapid decisions flexibly. However, behavior approaches these limits only after a long period of learning, during which the cerebral cortex interacts with the basal ganglia, an ancient part of the vertebrate brain that is responsible for learning sequences of actions directed toward achieving goals. Progress has been made in understanding the algorithms used by the brain during reinforcement learning, which is an online approximation of dynamic programming. Humans also plan by drawing on past experience to simulate different scenarios, a process called prospective optimization. The same brain structures in the cortex and basal ganglia that are active online during optimal behavior are also active offline during prospective optimization. The emergence of general principles and algorithms for goal-directed behavior has consequences for the development of autonomous devices in engineering applications.
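The statement that reinforcement learning is an online approximation of dynamic programming can be illustrated with a minimal sketch. The toy task below (a five-state random walk with a reward of 1 for exiting on the right) is an assumption chosen for illustration and does not come from the paper, as are all function names and parameter values. Dynamic programming sweeps over a known model of the task, whereas temporal-difference learning, TD(0), estimates the same state values from sampled experience alone.

```python
import random

N_STATES = 5   # non-terminal states 0..4; stepping off either end ends the episode
GAMMA = 1.0    # no discounting in this toy task (an illustrative assumption)


def step(state, rng):
    """Move left or right with equal probability; reward 1 only when exiting right."""
    nxt = state + rng.choice((-1, 1))
    if nxt < 0:
        return None, 0.0          # left terminal, no reward
    if nxt >= N_STATES:
        return None, 1.0          # right terminal, reward 1
    return nxt, 0.0


def dp_policy_evaluation(sweeps=1000):
    """Dynamic programming: repeated Bellman backups using the known model."""
    v = [0.0] * N_STATES
    for _ in range(sweeps):
        new_v = []
        for s in range(N_STATES):
            left = v[s - 1] if s > 0 else 0.0
            right = v[s + 1] if s < N_STATES - 1 else 0.0
            r_right = 1.0 if s == N_STATES - 1 else 0.0
            new_v.append(0.5 * (GAMMA * left) + 0.5 * (r_right + GAMMA * right))
        v = new_v
    return v


def td0(episodes=20000, alpha=0.02, seed=0):
    """TD(0): update values online from sampled transitions, with no model."""
    rng = random.Random(seed)
    v = [0.0] * N_STATES
    for _ in range(episodes):
        s = N_STATES // 2                      # every episode starts in the middle
        while s is not None:
            nxt, r = step(s, rng)
            target = r + (GAMMA * v[nxt] if nxt is not None else 0.0)
            v[s] += alpha * (target - v[s])    # move toward the sampled target
            s = nxt
    return v


if __name__ == "__main__":
    print("DP :", [round(x, 3) for x in dp_policy_evaluation()])
    print("TD0:", [round(x, 3) for x in td0()])
```

With a small step size, the TD(0) estimates fluctuate around the dynamic-programming values rather than matching them exactly, which is the sense in which the online method approximates the offline one.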
