Model Predictive Actor-Critic: Accelerating Robot Skill Acquisition with Deep Reinforcement Learning
Andrew S. Morgan | Daljeet Nandha | Georgia Chalvatzaki | Carlo D'Eramo | Aaron M. Dollar | Jan Peters