Optimality in motor control

Several models of human motor control propose that stereotypical movements such as reaching are optimal, in the sense that they minimize a performance cost function. From a computational point of view, however, optimal control is a hard problem. First, most natural biomechanical systems are redundant: they have more degrees of freedom than a task requires. Second, optimal control tasks often suffer from the "curse of dimensionality". For example, consider moving a two-joint arm from location A to location B in 100 time steps. If each joint torque is discretized to one of 10 values, then there are 10^200 possible torque sequences (10 choices for each of 2 joints at each of 100 time steps). How do we efficiently search a space of that size?

In this thesis we try to answer some of the questions that arise from a theory of optimal control. First, if optimal control is hard, how do humans accurately solve day-to-day control problems? We propose that for stereotypical movements such as reaching, the optimal control solution can be decomposed as a sum of scaled and time-shifted components, or "motor synergies". In one project, we discover these synergies through dimensionality reduction, and near-optimal control is achieved by linearly combining the discovered synergies. In a second project, synergies are solutions to representative motor tasks. We show that libraries of such synergies provide overcomplete representations of the space of motor solutions. Control solutions for new tasks are found through a greedy additive regression procedure that tends to linearly combine a small number of synergies from the library.

Second, we conducted experiments to show that humans are able to perform well even under certain non-stereotypical control regimes. In these experiments, subjects were required to control a dynamical system corrupted with noise. A comparison of human performance with the theoretically optimal solution shows that humans reach close to optimal performance under a variety of noise conditions. Additionally, we showed that subjects adapt to the noise regime rather than using a fixed controller across different noise conditions.

Finally, we propose a theoretical learning model inspired by behavioral shaping, an incremental training procedure commonly used to teach complex behaviors. In shaping, the learner is rewarded for performing successively more difficult tasks that provide monotonically better approximations of the target task. We formulate shaping in the context of a search problem. In a search problem, the learner has to use a membership oracle of a concept to find one positive instance of the concept. In a shaped search problem, the learner has access to a sequence of membership oracles of increasingly restrictive concepts leading to the final concept. For the concept class of intervals, we show that the search problem can be solved with an exponentially smaller number of samples if such a sequence is used. Additionally, we prove that convexity of concepts is important for shaping to be successful.
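
To make the idea of greedy additive regression over a synergy library more concrete, here is a minimal sketch in Python in the style of matching pursuit. It is an illustration under stated assumptions, not the procedure used in the thesis: the library is a random matrix whose columns stand in for synergies, time shifts are ignored, and the names `greedy_combine`, `library`, and `target` are hypothetical.

```python
import numpy as np


def greedy_combine(library, target, n_steps=5):
    """Greedily fit `target` as a weighted sum of the columns of `library`.

    Matching-pursuit-style sketch (an assumption, not the thesis's exact
    algorithm). Each column of `library` plays the role of one synergy.
    Returns the weights on the original columns and the final residual.
    """
    norms = np.linalg.norm(library, axis=0)
    atoms = library / norms                  # unit-norm copies of the synergies
    residual = np.asarray(target, dtype=float).copy()
    weights = np.zeros(library.shape[1])
    for _ in range(n_steps):
        scores = atoms.T @ residual          # correlation with the residual
        best = int(np.argmax(np.abs(scores)))
        if np.isclose(scores[best], 0.0):
            break                            # nothing left to explain
        # Add the best-matching synergy, scaled by its projection coefficient.
        weights[best] += scores[best] / norms[best]
        residual -= scores[best] * atoms[:, best]
    return weights, residual


# Toy data (hypothetical): 20 random "synergies" of length 50, and a target
# control signal built from two of them.
rng = np.random.default_rng(0)
library = rng.normal(size=(50, 20))
target = 2.0 * library[:, 3] - 1.5 * library[:, 11]

weights, residual = greedy_combine(library, target, n_steps=5)
print("synergies used:", np.flatnonzero(weights))
print("relative residual:", np.linalg.norm(residual) / np.linalg.norm(target))
```

Because each step adds only the single best-matching column, a small step budget yields a solution that combines few synergies, which is the sparsity property described above. With a non-orthogonal library the fit is generally approximate rather than exact.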
