Distributional Reinforcement Learning with Quantile Regression
Marc G. Bellemare | Rémi Munos | Mark Rowland | Will Dabney
[1] Frederick R. Forst,et al. On robust estimation of the location parameter , 1980 .
[2] D. Freedman,et al. Some Asymptotic Theory for the Bootstrap , 1981 .
[3] S. Goldstein,et al. On intrinsic randomness of dynamical systems , 1981 .
[4] Mahesan Niranjan,et al. On-line Q-learning using connectionist systems , 1994 .
[5] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[6] Matthias Heger,et al. Consideration of Risk in Reinforcement Learning , 1994, ICML.
[7] John N. Tsitsiklis,et al. Analysis of Temporal-Difference Learning with Function Approximation , 1996, NIPS.
[8] A. Müller. Integral Probability Metrics and Their Generating Classes of Functions , 1997, Advances in Applied Probability.
[9] Stuart J. Russell,et al. Bayesian Q-Learning , 1998, AAAI/IAAI.
[10] James W. Taylor. A Quantile Regression Approach to Estimating the Distribution of Multiperiod Returns , 1999 .
[11] Peter J. Bickel,et al. The Earth Mover's distance is the Mallows distance: some insights from statistics , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.
[12] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[13] Sean R Eddy,et al. What is dynamic programming? , 2004, Nature Biotechnology.
[14] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[15] Shie Mannor,et al. Reinforcement learning with Gaussian processes , 2005, ICML.
[16] Masashi Sugiyama,et al. Parametric Return Density Estimation for Reinforcement Learning , 2010, UAI.
[17] Ronny Luss,et al. Sparse Quantile Huber Regression for Efficient and Robust Estimation , 2014, ArXiv.
[18] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[19] Shane Legg,et al. Massively Parallel Methods for Deep Reinforcement Learning , 2015, ArXiv.
[20] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[21] Shie Mannor,et al. Risk-Sensitive and Robust Decision-Making: a CVaR Optimization Approach , 2015, NIPS.
[22] Marc G. Bellemare,et al. The Arcade Learning Environment: An Evaluation Platform for General Agents (Extended Abstract) , 2012, IJCAI.
[23] David Silver,et al. Deep Reinforcement Learning with Double Q-Learning , 2015, AAAI.
[24] Tom Schaul,et al. Dueling Network Architectures for Deep Reinforcement Learning , 2015, ICML.
[25] Tom Schaul,et al. Prioritized Experience Replay , 2015, ICLR.
[26] Léon Bottou,et al. Wasserstein Generative Adversarial Networks , 2017, ICML.
[27] Marc G. Bellemare,et al. A Distributional Perspective on Reinforcement Learning , 2017, ICML.
[28] Marc G. Bellemare,et al. The Cramer Distance as a Solution to Biased Wasserstein Gradients , 2017, ArXiv.
[29] Naomi S. Altman,et al. Quantile regression , 2019, Nature Methods.