Learning to Combine Motor Primitives Via Greedy Additive Regression

The computational complexity of motor control can be reduced by using a library of motor synergies. We present a new model, referred to as the Greedy Additive Regression (GAR) model, for learning a library of torque sequences and for learning the coefficients of a linear combination of those sequences that minimizes a cost function. From the perspective of numerical optimization, the GAR model is interesting because it creates a library of "local features" (each sequence in the library is the solution to a single training task) and learns to combine these sequences using a local optimization procedure, namely additive regression. We speculate that learners with local representational primitives and local optimization procedures will show good performance on nonlinear tasks. The GAR model is also interesting from the perspective of motor control because it outperforms several competing models. Results using a simulated two-joint arm suggest that the GAR model consistently shows excellent performance in the sense that it rapidly learns to perform novel, complex motor tasks. Moreover, its library is overcomplete and sparsely used, meaning that only a small fraction of the stored torque sequences is used when learning a new movement. The library is also robust in the sense that, after an initial training period, nearly all novel movements can be learned as additive combinations of sequences already in the library, and in the sense that it generalizes well when the arm's dynamics are altered between training and test conditions, such as when a payload is added to the arm. Lastly, the GAR model works well regardless of whether motor tasks are specified in joint space or Cartesian space. We conclude that learning techniques using local primitives and local optimization procedures are viable and potentially important methods for motor control, and possibly other domains, and that they deserve further examination by the artificial intelligence and cognitive science communities.
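
The abstract does not spell out the combination procedure in detail, so the following is a minimal Python sketch, under assumed details, of the kind of greedy additive regression it describes: stored torque sequences are selected one at a time, each with a least-squares coefficient fit to the current residual, until a squared-error cost stops improving. The function name `greedy_additive_fit`, the squared-error cost, and the array shapes are illustrative assumptions, not the paper's specification.

```python
import numpy as np

def greedy_additive_fit(library, target, max_terms=10, tol=1e-6):
    """Greedily build a sparse linear combination of library torque
    sequences that approximates a target torque sequence.

    library : (n_sequences, T) array of stored torque sequences
    target  : (T,) torque sequence required by the new task
    Returns a list of (library index, coefficient) pairs.
    """
    residual = target.astype(float)
    chosen = []
    norms = np.einsum('ij,ij->i', library, library) + 1e-12   # squared norm of each sequence
    for _ in range(max_terms):
        coeffs = library @ residual / norms     # least-squares coefficient per candidate
        gains = coeffs ** 2 * norms             # squared-error reduction per candidate
        best = int(np.argmax(gains))
        if gains[best] < tol:                   # no stored sequence helps any further
            break
        chosen.append((best, float(coeffs[best])))
        residual = residual - coeffs[best] * library[best]    # greedy additive update
    return chosen

# Toy usage: a target that is an exact combination of two stored sequences.
rng = np.random.default_rng(0)
lib = rng.normal(size=(50, 200))                # 50 sequences, 200 time steps each
tgt = 0.7 * lib[3] + 1.2 * lib[17]
print(greedy_additive_fit(lib, tgt, max_terms=5))
```

In this simplified form the procedure is essentially matching pursuit over the stored sequences; in the model described above, the cost is defined over the motor task (in joint space or Cartesian space) rather than over a given target sequence, and the library itself is built from torque sequences that solve individual training tasks.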
