Rémi Munos | Bilal Piot | Mohammad Gheshlaghi Azar | Zhaohan Daniel Guo | Bernardo A. Pires | Tobias Pohlen
[1] Lawrence R. Rabiner,et al. A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.
[2] H. B. Barlow,et al. Unsupervised Learning , 1989, Neural Computation.
[3] Ronald L. Rivest,et al. Inference of finite automata using homing sequences , 1989, STOC '89.
[4] Jürgen Schmidhuber,et al. Curious model-building control systems , 1991, [Proceedings] 1991 IEEE International Joint Conference on Neural Networks.
[5] W. Lovejoy. A survey of algorithmic methods for partially observed Markov decision processes , 1991 .
[7] Michael I. Jordan,et al. Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems , 1994, NIPS.
[8] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[9] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[10] Alex M. Andrew,et al. Reinforcement Learning: An Introduction , 1998 .
[11] A. Cassandra,et al. Exact and approximate algorithms for partially observable Markov decision processes , 1998 .
[12] Craig Boutilier,et al. Decision-Theoretic Planning: Structural Assumptions and Computational Leverage , 1999, J. Artif. Intell. Res..
[13] Terrence J. Sejnowski,et al. Unsupervised Learning , 2018, Encyclopedia of GIS.
[14] Richard S. Sutton,et al. Predictive Representations of State , 2001, NIPS.
[15] Carlos Guestrin,et al. Generalizing plans to new environments in relational MDPs , 2003, IJCAI 2003.
[16] Eric R. Ziegel,et al. The Elements of Statistical Learning , 2003, Technometrics.
[17] Michael R. James,et al. Predictive State Representations: A New Theory for Modeling Dynamical Systems , 2004, UAI.
[18] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[19] Yoshua Bengio,et al. Greedy Layer-Wise Training of Deep Networks , 2006, NIPS.
[20] Alan Fern,et al. Multi-task reinforcement learning: a hierarchical Bayesian approach , 2007, ICML '07.
[21] Andre Cohen,et al. An object-oriented representation for efficient reinforcement learning , 2008, ICML '08.
[22] Andrew Y. Ng,et al. Near-Bayesian exploration in polynomial time , 2009, ICML '09.
[23] Yoshua Bengio,et al. Why Does Unsupervised Pre-training Help Deep Learning? , 2010, AISTATS.
[24] Richard L. Lewis,et al. Variance-Based Rewards for Approximate Bayesian Reinforcement Learning , 2010, UAI.
[25] Byron Boots,et al. Closing the learning-planning loop with predictive state representations , 2009, Int. J. Robotics Res..
[26] Aapo Hyvärinen,et al. Noise-contrastive estimation: A new estimation principle for unnormalized statistical models , 2010, AISTATS.
[27] Patrick M. Pilarski,et al. Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction , 2011, AAMAS.
[28] Michael L. Littman,et al. Learning is planning: near Bayes-optimal reinforcement learning via Monte-Carlo tree search , 2011, UAI.
[29] Aapo Hyvärinen,et al. Noise-Contrastive Estimation of Unnormalized Statistical Models, with Applications to Natural Image Statistics , 2012, J. Mach. Learn. Res..
[30] Pascal Vincent,et al. Representation Learning: A Review and New Perspectives , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[31] Daan Wierstra,et al. Stochastic Backpropagation and Approximate Inference in Deep Generative Models , 2014, ICML.
[32] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[33] Anima Anandkumar,et al. Tensor decompositions for learning latent variable models , 2012, J. Mach. Learn. Res..
[34] Yoshua Bengio,et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling , 2014, ArXiv.
[35] Quoc V. Le,et al. Sequence to Sequence Learning with Neural Networks , 2014, NIPS.
[36] Shie Mannor,et al. Bayesian Reinforcement Learning: A Survey , 2015, Found. Trends Mach. Learn..
[37] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[38] Jianfeng Gao,et al. Recurrent Reinforcement Learning: A Hybrid Approach , 2015, ArXiv.
[39] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[40] Dit-Yan Yeung,et al. Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting , 2015, NIPS.
[41] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[42] Yuan Yu,et al. TensorFlow: A system for large-scale machine learning , 2016, OSDI.
[43] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[44] Sergey Levine,et al. End-to-End Training of Deep Visuomotor Policies , 2015, J. Mach. Learn. Res..
[45] Vladlen Koltun,et al. Learning to Act by Predicting the Future , 2016, ICLR.
[46] Christopher Burgess,et al. DARLA: Improving Zero-Shot Transfer in Reinforcement Learning , 2017, ICML.
[47] Tom Schaul,et al. Reinforcement Learning with Unsupervised Auxiliary Tasks , 2016, ICLR.
[48] Guillaume Desjardins,et al. Understanding disentangling in β-VAE , 2018, ArXiv.
[49] Misha Denil,et al. Learning Awareness Models , 2018, ICLR.
[50] David Hsu,et al. Integrating Algorithmic Planning and Deep Learning for Partially Observable Navigation , 2018, ArXiv.
[52] Predictive State Representations of Controlled Models , 2018 .
[53] Shimon Whiteson,et al. Deep Variational Reinforcement Learning for POMDPs , 2018, ICML.
[54] Oriol Vinyals,et al. Representation Learning with Contrastive Predictive Coding , 2018, ArXiv.
[55] Joel Z. Leibo,et al. Unsupervised Predictive Memory in a Goal-Directed Agent , 2018, ArXiv.
[56] Koray Kavukcuoglu,et al. Neural scene representation and rendering , 2018, Science.