Learning Control in Robotics
[1] M. Ciletti, et al. The computation and theory of optimal control, 1972.
[2] David Q. Mayne, et al. Differential dynamic programming, 1972, The Mathematical Gazette.
[3] Y. Bar-Shalom, et al. Wide-sense adaptive dual control for nonlinear stochastic systems, 1973.
[4] Yaakov Bar-Shalom, et al. Caution, Probing, and the Value of Information in the Control of Uncertain Systems, 1976.
[5] D. Rubin, et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper, 1977.
[6] W. Cleveland. Robust Locally Weighted Regression and Smoothing Scatterplots, 1979.
[7] Christopher G. Atkeson, et al. Using Local Models to Control Movement, 1989, NIPS.
[8] Vijaykumar Gullapalli, et al. A stochastic reinforcement learning algorithm for learning real-valued functions, 1990, Neural Networks.
[9] Michael I. Jordan, et al. Forward Models: Supervised Learning with a Distal Teacher, 1992, Cogn. Sci.
[10] R. J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 1992, Machine Learning.
[11] Michael I. Jordan, et al. Supervised learning from incomplete data via an EM approach, 1993, NIPS.
[12] M. Kawato, et al. Trajectory formation of arm movement by a neural network with forward and inverse dynamics models, 1993.
[13] S. Grossberg, et al. A Self-Organizing Neural Model of Motor Equivalent Reaching and Tool Use by a Multijoint Arm, 1993, Journal of Cognitive Neuroscience.
[14] Ben J. A. Kröse, et al. Learning from delayed rewards, 1995, Robotics Auton. Syst.
[15] J. Spall, et al. Optimal random perturbations for stochastic approximation using a simultaneous perturbation gradient approximation, 1997, Proceedings of the 1997 American Control Conference.
[16] Geoffrey E. Hinton, et al. Using Expectation-Maximization for Reinforcement Learning, 1997, Neural Computation.
[17] Stefan Schaal, et al. Robot Learning From Demonstration, 1997, ICML.
[18] Stefan Schaal, et al. Learning tasks from a single demonstration, 1997, Proceedings of the International Conference on Robotics and Automation.
[19] Christopher G. Atkeson, et al. Constructive Incremental Learning from Only Local Information, 1998, Neural Computation.
[20] D. Scully, et al. Observational learning in motor skill acquisition: A look at demonstrations, 1998.
[21] D. M. Wolpert, et al. Multiple paired forward and inverse models for motor control, 1998, Neural Networks.
[22] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[23] Shun-ichi Amari, et al. Natural Gradient Learning for Over- and Under-Complete Bases in ICA, 1999, Neural Computation.
[24] Stefan Schaal, et al. Is imitation learning the route to humanoid robots?, 1999, Trends in Cognitive Sciences.
[25] Manfred Opper, et al. Sparse Representation for Gaussian Process Models, 2000, NIPS.
[26] Geoffrey E. Hinton, et al. Using EM for Reinforcement Learning, 2000.
[27] Kenji Doya, et al. Reinforcement Learning in Continuous Time and Space, 2000, Neural Computation.
[28] Andrew Y. Ng, et al. Algorithms for Inverse Reinforcement Learning, 2000, ICML.
[29] Bruno Siciliano, et al. Modelling and Control of Robot Manipulators, 1997, Advanced Textbooks in Control and Signal Processing.
[30] Michael I. Jordan, et al. PEGASUS: A policy search method for large MDPs and POMDPs, 2000, UAI.
[31] Stefan Schaal, et al. Learning inverse kinematics, 2001, Proceedings of the 2001 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[32] Jun Morimoto, et al. Acquisition of stand-up behavior by a real robot using hierarchical reinforcement learning, 2000, Robotics Auton. Syst.
[33] Sham M. Kakade, et al. A Natural Policy Gradient, 2001, NIPS.
[34] Douglas Aberdeen, et al. Scalable Internal-State Policy-Gradient Methods for POMDPs, 2002, ICML.
[35] Jun Nakanishi, et al. Learning Attractor Landscapes for Learning Motor Primitives, 2002, NIPS.
[36] Jonathan Baxter, et al. Scaling Internal-State Policy-Gradient Methods for POMDPs, 2002.
[37] Stefan Schaal, et al. Computational approaches to motor learning by imitation, 2003, Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences.
[38] Andrew W. Moore, et al. Locally Weighted Learning for Control, 1997, Artificial Intelligence Review.
[39] H. Sebastian Seung, et al. Stochastic policy gradient reinforcement learning on a simple 3D biped, 2004, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[40] Huan Liu, et al. Customer Retention via Data Mining, 2000, Artificial Intelligence Review.
[41] Andrew W. Moore, et al. Locally Weighted Learning, 1997, Artificial Intelligence Review.
[42] Yoshihiko Nakamura, et al. Embodied Symbol Emergence Based on Mimesis Theory, 2004, Int. J. Robotics Res.
[43] Andrew W. Moore, et al. Variable Resolution Discretization in Optimal Control, 2002, Machine Learning.
[44] Jessica K. Hodgins, et al. Synthesizing physically realistic human motion in low-dimensional, behavior-specific spaces, 2004, SIGGRAPH.
[45] Pieter Abbeel, et al. Apprenticeship learning via inverse reinforcement learning, 2004, ICML.
[46] H. Kappen. Linear theory for control of nonlinear stochastic systems, 2004, Physical Review Letters.
[47] Pierre Geurts, et al. Tree-Based Batch Mode Reinforcement Learning, 2005, J. Mach. Learn. Res.
[48] Deb Roy, et al. Connecting language to the world, 2005, Artif. Intell.
[49] Geoffrey J. Gordon, et al. Finding Approximate POMDP Solutions Through Belief Compression, 2011, J. Artif. Intell. Res.
[50] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[51] Richard S. Sutton, et al. Learning to predict by the methods of temporal differences, 1988, Machine Learning.
[52] Stefan Schaal, et al. Incremental Online Learning in High Dimensions, 2005, Neural Computation.
[53] Stefan Schaal, et al. Natural Actor-Critic, 2003, Neurocomputing.
[54] Warren B. Powell, et al. Handbook of Learning and Approximate Dynamic Programming, 2006, IEEE Transactions on Automatic Control.
[55] Marc Toussaint, et al. Probabilistic inference for solving discrete and continuous state Markov Decision Processes, 2006, ICML.
[56] Stefan Schaal, et al. Policy Gradient Methods for Robotics, 2006, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[57] Sanjiv Singh, et al. The 2005 DARPA Grand Challenge, 2007.
[58] Radford M. Neal. Pattern Recognition and Machine Learning, 2007, Technometrics.
[59] C. Atkeson. Randomly Sampling Actions in Dynamic Programming, 2007, IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning.
[60] Stefan Schaal, et al. The New Robotics—towards Human-centered Machines, 2007.
[61] H. Kappen. An introduction to stochastic control theory, path integrals and reinforcement learning, 2007.
[62] Sanjiv Singh, et al. The 2005 DARPA Grand Challenge: The Great Robot Race, 2007.
[63] Christopher G. Atkeson, et al. Random Sampling of States in Dynamic Programming, 2007, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[64] Duy Nguyen-Tuong, et al. Local Gaussian Process Regression for Real Time Online Model Learning, 2008, NIPS.
[65] Jürgen Schmidhuber, et al. State-Dependent Exploration for Policy Gradient Methods, 2008, ECML/PKDD.
[66] Stefan Schaal, et al. A Bayesian approach to empirical local linearization for robotics, 2008, IEEE International Conference on Robotics and Automation.
[67] Jan Peters, et al. Fitted Q-iteration by Advantage Weighted Regression, 2008, NIPS.
[68] Jun Morimoto, et al. Learning CPG-based Biped Locomotion with a Policy Gradient Method: Application to a Humanoid Robot, 2005, IEEE-RAS International Conference on Humanoid Robots.
[69] Stefan Schaal, et al. Robot Programming by Demonstration, 2009, Springer Handbook of Robotics.
[70] Stefan Schaal, et al. Learning to Control in Operational Space, 2008, Int. J. Robotics Res.
[71] Stefan Schaal, et al. Reinforcement learning of motor skills with policy gradients, 2008, Neural Networks.
[72] Sanjiv Singh, et al. The DARPA Urban Challenge: Autonomous Vehicles in City Traffic, George Air Force Base, Victorville, California, USA, 2009, The DARPA Urban Challenge.
[73] Emanuel Todorov, et al. Efficient computation of optimal actions, 2009, Proceedings of the National Academy of Sciences.
[74] Christopher G. Atkeson, et al. Standing balance control using a trajectory library, 2009, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[75] Stefan Schaal, et al. Path integral-based stochastic optimal control for rigid body dynamics, 2009, IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning.
[76] David Silver, et al. Learning to search: Functional gradient techniques for imitation learning, 2009, Auton. Robots.
[77] Christopher G. Atkeson, et al. Control of a walking biped using a combination of simple policies, 2009, IEEE-RAS International Conference on Humanoid Robots.
[78] Carl E. Rasmussen, et al. Gaussian Processes for Machine Learning, 2005, Adaptive Computation and Machine Learning.
[79] Marc Toussaint, et al. Learning model-free robot control by a Monte Carlo EM algorithm, 2009, Auton. Robots.
[80] Carl E. Rasmussen, et al. Gaussian process dynamic programming, 2009, Neurocomputing.
[81] Jan Peters, et al. Practical Learning Algorithms for Motor Primitives in Robotics, 2010.
[82] D. Prattichizzo. Robotics in Second Life, IEEE Robotics & Automation Magazine.