Shie Mannor | Tom Zahavy | Chen Tessler | Philip Korsunsky | Stav Belogolovsky
[1] Shie Mannor, et al. Contextual Markov Decision Processes, 2015, ArXiv.
[2] Michael Kearns, et al. Near-Optimal Reinforcement Learning in Polynomial Time, 2002, Machine Learning.
[3] Martin Jaggi, et al. Revisiting Frank-Wolfe: Projection-Free Sparse Convex Optimization, 2013, ICML.
[4] Sergey Levine, et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, 2017, ICML.
[5] Nan Jiang, et al. Repeated Inverse Reinforcement Learning, 2017, NIPS.
[6] Martin Zinkevich, et al. Online Convex Programming and Generalized Infinitesimal Gradient Ascent, 2003, ICML.
[7] Mordecai Avriel, et al. Mathematical Programming for Industrial Engineers, 1997.
[8] Haim Kaplan, et al. Average reward reinforcement learning with unknown mixing times, 2019, ArXiv.
[9] Michael H. Bowling, et al. Apprenticeship learning using linear programming, 2008, ICML '08.
[10] Peter Szolovits, et al. Deep Reinforcement Learning for Sepsis Treatment, 2017, ArXiv.
[11] Moustapha Cissé, et al. Parseval Networks: Improving Robustness to Adversarial Examples, 2017, ICML.
[12] J. Andrew Bagnell, et al. Efficient Reductions for Imitation Learning, 2010, AISTATS.
[13] Elad Hazan, et al. Introduction to Online Convex Optimization, 2016, Found. Trends Optim.
[14] Peter Szolovits, et al. MIMIC-III, a freely accessible critical care database, 2016, Scientific Data.
[15] Geoffrey J. Gordon, et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning, 2010, AISTATS.
[16] Jonathan Lee, et al. Iterative Noise Injection for Scalable Imitation Learning, 2017, ArXiv.
[17] Robert E. Schapire, et al. A Game-Theoretic Approach to Apprenticeship Learning, 2007, NIPS.
[18] Srivatsan Srinivasan, et al. Truly Batch Apprenticeship Learning with Deep Successor Features, 2019, IJCAI.
[19] Martin L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[20] C. Blair. Problem Complexity and Method Efficiency in Optimization (A. S. Nemirovsky and D. B. Yudin), 1985.
[21] Stefano Ermon, et al. Generative Adversarial Imitation Learning, 2016, NIPS.
[22] Shamim Nemati, et al. Does the "Artificial Intelligence Clinician" learn optimal treatment strategies for sepsis in intensive care?, 2019, ArXiv.
[23] Sébastien Bubeck, et al. Convex Optimization: Algorithms and Complexity, 2014, Found. Trends Mach. Learn.
[24] S. Murphy, et al. Dynamic Treatment Regimes, 2014, Annual Review of Statistics and Its Application.
[25] Stephen P. Boyd, et al. Linear controller design: limits of performance, 1991.
[26] Anind K. Dey, et al. Maximum Entropy Inverse Reinforcement Learning, 2008, AAAI.
[27] Ambuj Tewari, et al. Contextual Markov Decision Processes using Generalized Linear Models, 2019, ArXiv.
[28] Marc Teboulle, et al. Mirror descent and nonlinear projected subgradient methods for convex optimization, 2003, Oper. Res. Lett.
[29] Xi Chen, et al. Evolution Strategies as a Scalable Alternative to Reinforcement Learning, 2017, ArXiv.
[30] A. Slooter, et al. Intraoperative hypotension and the risk of postoperative adverse outcomes: a systematic review, 2018, British Journal of Anaesthesia.
[31] John Langford, et al. Approximately Optimal Approximate Reinforcement Learning, 2002, ICML.
[32] Haim Kaplan, et al. Apprenticeship Learning via Frank-Wolfe, 2019, AAAI.
[33] Pieter Abbeel, et al. Exploration and apprenticeship learning in reinforcement learning, 2005, ICML.
[34] Atul Malhotra, et al. Personalizing mechanical ventilation for acute respiratory distress syndrome, 2016, Journal of Thoracic Disease.
[35] Tom Schaul, et al. Successor Features for Transfer in Reinforcement Learning, 2016, NIPS.
[36] Xin Zhang, et al. End to End Learning for Self-Driving Cars, 2016, ArXiv.
[37] Anca D. Dragan, et al. Learning a Prior over Intent via Meta-Inverse Reinforcement Learning, 2018, ICML.
[38] Barbara E. Engelhardt, et al. A Reinforcement Learning Approach to Weaning of Mechanical Ventilation in Intensive Care Units, 2017, UAI.
[39] Fredrik D. Johansson, et al. Guidelines for reinforcement learning in healthcare, 2019, Nature Medicine.
[40] Richard S. Zemel, et al. SMILe: Scalable Meta Inverse Reinforcement Learning through Context-Conditional Policies, 2019, NeurIPS.
[41] Dean Pomerleau. ALVINN: An Autonomous Land Vehicle in a Neural Network, 1988, NIPS.
[42] Siddhartha S. Srinivasa, et al. Imitation learning for locomotion and manipulation, 2007, 7th IEEE-RAS International Conference on Humanoid Robots.
[43] Aldo A. Faisal, et al. The Artificial Intelligence Clinician learns optimal treatment strategies for sepsis in intensive care, 2018, Nature Medicine.
[44] Srivatsan Srinivasan, et al. Evaluating Reinforcement Learning Algorithms in Observational Health Settings, 2018, ArXiv.
[45] D. Murray, et al. Sepsis: Personalized Medicine Utilizing ‘Omic’ Technologies—A Paradigm Shift?, 2018, Healthcare.
[46] Yurii Nesterov, et al. Random Gradient-Free Minimization of Convex Functions, 2015, Foundations of Computational Mathematics.
[47] John Darzentas, et al. Problem Complexity and Method Efficiency in Optimization, 1983.
[48] Pieter Abbeel, et al. Apprenticeship learning via inverse reinforcement learning, 2004, ICML.
[49] Sergey Levine, et al. From Language to Goals: Inverse Reinforcement Learning for Vision-Based Instruction Following, 2019, ICLR.
[50] Dimitri P. Bertsekas. Nonlinear Programming, 1997.
[51] Laurent Orseau, et al. AI Safety Gridworlds, 2017, ArXiv.
[52] Andrew Y. Ng, et al. Algorithms for Inverse Reinforcement Learning, 2000, ICML.
[53] J. MacQueen. Some methods for classification and analysis of multivariate observations, 1967.
[54] Dan Garber, et al. A Linearly Convergent Variant of the Conditional Gradient Algorithm under Strong Convexity, with Applications to Online and Stochastic Optimization, 2016, SIAM J. Optim.
[55] H. Robbins. A Stochastic Approximation Method, 1951.
[56] Nan Jiang, et al. Markov Decision Processes with Continuous Side Information, 2017, ALT.