Autonomous reinforcement of behavioral sequences in neural dynamics

We introduce Dynamic Neural (DN) SARSA(λ), a neural-dynamic algorithm for learning a behavioral sequence from delayed reward. DN-SARSA(λ) combines Dynamic Field Theory models of behavioral sequence representation, classical reinforcement learning, and Item and Order working memory, a computational neuroscience model of working memory that serves as the eligibility trace. DN-SARSA(λ) is implemented on both a simulated and a real robot, each of which must learn a specific rewarded sequence of elementary behaviors through exploration. Results show that DN-SARSA(λ) performs on par with discrete SARSA(λ), validating the feasibility of general reinforcement learning without compromising neural dynamics.
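For orientation, the discrete SARSA(λ) baseline named in the abstract can be sketched in tabular form. This is not DN-SARSA(λ) and is not taken from the paper: the toy task (reproduce a fixed target sequence of behavior indices, with a reward of 1 only after the last correct behavior and termination on the first wrong one), the function names `sarsa_lambda` and `greedy_sequence`, and all hyperparameter values are assumptions chosen for illustration. The table `e` plays the role of the eligibility trace, which in DN-SARSA(λ) is provided by the Item and Order working memory.

```python
import random


def sarsa_lambda(target, n_actions, episodes=2000, alpha=0.1, gamma=0.9,
                 lam=0.8, epsilon=0.1, seed=0):
    """Tabular SARSA(lambda) with accumulating traces on a toy sequence task.

    Assumed task (illustrative only): state s is the number of correct
    behaviors executed so far; choosing target[s] advances the sequence,
    any other behavior ends the episode without reward.
    """
    rng = random.Random(seed)
    n_states = len(target)
    q = [[0.0] * n_actions for _ in range(n_states)]

    def choose(state):
        # Epsilon-greedy selection with random tie-breaking.
        if rng.random() < epsilon:
            return rng.randrange(n_actions)
        best = max(q[state])
        return rng.choice([a for a in range(n_actions) if q[state][a] == best])

    for _ in range(episodes):
        # Eligibility traces are reset at the start of every episode.
        e = [[0.0] * n_actions for _ in range(n_states)]
        s, a = 0, choose(0)
        while True:
            correct = a == target[s]
            done = (not correct) or s + 1 == n_states
            reward = 1.0 if (correct and done) else 0.0  # delayed reward
            if done:
                delta = reward - q[s][a]
            else:
                s2 = s + 1
                a2 = choose(s2)  # on-policy: next action picked before update
                delta = reward + gamma * q[s2][a2] - q[s][a]
            e[s][a] += 1.0  # accumulating trace for the visited pair
            for i in range(n_states):
                for j in range(n_actions):
                    q[i][j] += alpha * delta * e[i][j]
                    e[i][j] *= gamma * lam  # decay all traces
            if done:
                break
            s, a = s2, a2
    return q


def greedy_sequence(q):
    """Return the highest-valued behavior for each step of the sequence."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


if __name__ == "__main__":
    q_table = sarsa_lambda(target=[2, 0, 3], n_actions=4)
    print(greedy_sequence(q_table))
```

Because the reward arrives only at the end of the sequence, the traces are what let a single successful episode credit every earlier behavior in that sequence, rather than only the final one.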
