Interrupting behaviour: Minimizing decision costs via temporal commitment and low-level interrupts

Ideal decision-makers should constantly assess all sources of information about opportunities and threats, and be able to redetermine their choices promptly in the face of change. However, perpetual monitoring and reassessment impose inordinate sensing and computational costs, making them impractical for animals and machines alike. The obvious alternative of committing for extended periods of time to limited sensory strategies associated with particular courses of action can be dangerous and wasteful. Here, we explore the intermediate possibility of making provisional temporal commitments whilst admitting interruption based on limited broader observation. We simulate foraging under threat of predation to elucidate the benefits of such a scheme. We relate our results to diseases of distractibility and roving attention, and consider mechanistic substrates such as noradrenergic neuromodulation.
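The contrast drawn above between perpetual monitoring, blind commitment, and interruptible commitment can be illustrated with a toy simulation. The sketch below is not the model used in the paper: the function `simulate`, the three policy names, and every parameter value (costs, detection probability, capture penalty) are hypothetical choices made purely for illustration.

```python
import random


def simulate(policy, steps=1000, commit_len=10, p_threat=0.02,
             food=1.0, full_cost=0.5, cheap_cost=0.05,
             p_detect=0.9, capture_penalty=50.0, seed=0):
    """Toy foraging run under predation threat; returns total net reward.

    All names and numbers are illustrative assumptions.

    policy:
      "monitor"   - full reassessment on every step (costly, never surprised)
      "commit"    - full reassessment only at the start of each commitment period
      "interrupt" - as "commit", plus a cheap, imperfect threat check every step
    """
    if policy not in ("monitor", "commit", "interrupt"):
        raise ValueError(f"unknown policy: {policy!r}")

    rng = random.Random(seed)
    total = 0.0
    for t in range(steps):
        # Draw both random numbers on every step so that all policies
        # face the same threat sequence for a given seed.
        threat = rng.random() < p_threat
        cheap_check_fires = rng.random() < p_detect
        boundary = (t % commit_len == 0)

        if policy == "monitor":
            total -= full_cost
            noticed = threat
        elif policy == "commit":
            if boundary:
                total -= full_cost
            noticed = threat and boundary
        else:  # "interrupt"
            total -= cheap_cost
            if boundary:
                total -= full_cost
            noticed = threat and (boundary or cheap_check_fires)

        if threat:
            # A noticed threat is escaped at the price of the step's food;
            # an unnoticed one incurs the capture penalty.
            if not noticed:
                total -= capture_penalty
        else:
            total += food
    return total


if __name__ == "__main__":
    for name in ("monitor", "commit", "interrupt"):
        print(name, round(simulate(name), 2))
```

In this sketch the interruptible policy pays a small per-step sensing cost in exchange for catching most threats that arise in the middle of a commitment period. Which policy comes out ahead depends entirely on the assumed costs and threat rate.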
