Shirli Di-Castro Shashua | Dotan DiCastro | Shie Mannor
[1] Martha White,et al. Organizing Experience: a Deeper Look at Replay Mechanisms for Sample-Based Planning in Continuous State Domains , 2018, IJCAI.
[2] Marcin Andrychowicz,et al. Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research , 2018, ArXiv.
[3] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[4] Andrew J. Davison,et al. Transferring End-to-End Visuomotor Control from Simulation to Real World for a Multi-Stage Task , 2017, CoRL.
[5] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[6] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[7] Yevgen Chebotar,et al. Closing the Sim-to-Real Loop: Adapting Simulation Randomization with Real World Experience , 2018, 2019 International Conference on Robotics and Automation (ICRA).
[8] Hyunsoo Kim,et al. Learning to Discover Cross-Domain Relations with Generative Adversarial Networks , 2017, ICML.
[9] Sergey Levine,et al. Generalization through Simulation: Integrating Simulated and Real Data into Deep Reinforcement Learning for Vision-Based Autonomous Flight , 2019, 2019 International Conference on Robotics and Automation (ICRA).
[10] Shalabh Bhatnagar,et al. Incremental Natural Actor-Critic Algorithms , 2007, NIPS.
[11] V. Borkar. Stochastic Approximation: A Dynamical Systems Viewpoint , 2008, Texts and Readings in Mathematics.
[12] Shalabh Bhatnagar,et al. A simultaneous perturbation stochastic approximation-based actor-critic algorithm for Markov decision processes , 2004, IEEE Transactions on Automatic Control.
[13] Sergey Levine,et al. Learning hand-eye coordination for robotic grasping with deep learning and large-scale data collection , 2016, Int. J. Robotics Res..
[14] Ron Meir,et al. A Convergent Online Single Time Scale Actor Critic Algorithm , 2009, J. Mach. Learn. Res..
[15] Wojciech Zaremba,et al. Domain randomization for transferring deep neural networks from simulation to the real world , 2017, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[16] Long-Ji Lin,et al. Reinforcement learning for robots using neural networks , 1992 .
[17] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[18] U. Rieder,et al. Markov Decision Processes , 2010 .
[19] Sergey Levine,et al. QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation , 2018, CoRL.
[20] Marcin Andrychowicz,et al. Sim-to-Real Transfer of Robotic Control with Dynamics Randomization , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[21] James Zou,et al. The Effects of Memory Replay in Reinforcement Learning , 2017, 2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton).
[22] Sergey Levine,et al. Collective robot reinforcement learning with distributed asynchronous guided policy search , 2016, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[23] François Laviolette,et al. Domain-Adversarial Training of Neural Networks , 2015, J. Mach. Learn. Res..
[24] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[25] David Budden,et al. Distributed Prioritized Experience Replay , 2018, ICLR.
[26] H. Kushner,et al. Stochastic Approximation and Recursive Algorithms and Applications , 2003 .
[27] Yingbin Liang,et al. Finite-Sample Analysis for SARSA with Linear Function Approximation , 2019, NeurIPS.
[28] Shie Mannor,et al. Finite Sample Analyses for TD(0) With Function Approximation , 2017, AAAI.
[29] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[30] Sergey Levine,et al. Sim-To-Real via Sim-To-Sim: Data-Efficient Robotic Grasping via Randomized-To-Canonical Adaptation Networks , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[31] Gerardo Rubino,et al. Introduction to Rare Event Simulation , 2009, Rare Event Simulation using Monte Carlo Methods.
[32] Harold J. Kushner,et al. Stochastic Approximation Methods for Constrained and Unconstrained Systems , 1978 .
[33] Stefano Ermon,et al. A DIRT-T Approach to Unsupervised Domain Adaptation , 2018, ICLR.
[34] Yoshua Bengio,et al. Revisiting Fundamentals of Experience Replay , 2020, ICML.
[35] Dieter Fox,et al. BayesSim: adaptive domain randomization via probabilistic inference for robotics simulators , 2019, Robotics: Science and Systems.
[36] Inman Harvey,et al. Noise and the Reality Gap: The Use of Simulation in Evolutionary Robotics , 1995, ECAL.
[37] Tomas Pfister,et al. Learning from Simulated and Unsupervised Images through Adversarial Training , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[38] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[39] Michael I. Jordan,et al. Learning Transferable Features with Deep Adaptation Networks , 2015, ICML.
[40] Tom Schaul,et al. Prioritized Experience Replay , 2015, ICLR.
[41] Henrik I. Christensen,et al. How to pick the domain randomization parameters for sim-to-real transfer of reinforcement learning policies? , 2019, ArXiv.
[42] Abhinav Gupta,et al. Supersizing self-supervision: Learning to grasp from 50K tries and 700 robot hours , 2015, 2016 IEEE International Conference on Robotics and Automation (ICRA).
[43] George Trigeorgis,et al. Domain Separation Networks , 2016, NIPS.
[44] Sergey Levine,et al. (CAD)²RL: Real Single-Image Flight without a Single Real Image , 2016, Robotics: Science and Systems.
[45] Shalabh Bhatnagar,et al. Natural actor-critic algorithms , 2009, Autom..
[46] Daochen Zha,et al. Experience Replay Optimization , 2019, IJCAI.
[47] Richard S. Sutton,et al. A Deeper Look at Experience Replay , 2017, ArXiv.
[48] Jakub W. Pachocki,et al. Learning dexterous in-hand manipulation , 2018, Int. J. Robotics Res..
[49] Quanquan Gu,et al. A Finite Time Analysis of Two Time-Scale Actor Critic Methods , 2020, NeurIPS.
[50] Marcin Andrychowicz,et al. Hindsight Experience Replay , 2017, NIPS.