Majid Nili Ahmadabadi | Reshad Hosseini | Maryam Hashemzadeh
[1] Majid Nili Ahmadabadi, et al. Interactive Learning in Continuous Multimodal Space: A Bayesian Approach to Action-Based Soft Partitioning and Learning, 2012, IEEE Transactions on Autonomous Mental Development.
[2] Leslie Pack Kaelbling, et al. DetH*: Approximate Hierarchical Solution of Large Markov Decision Processes, 2011, IJCAI.
[3] Hamid Beigy, et al. A novel graphical approach to automatic abstraction in reinforcement learning, 2013, Robotics Auton. Syst.
[4] Michael L. Littman, et al. Reinforcement learning improves behaviour from evaluative feedback, 2015, Nature.
[5] Peter Auer, et al. Near-optimal Regret Bounds for Reinforcement Learning, 2008, J. Mach. Learn. Res.
[6] Richard S. Sutton, et al. Efficient planning in MDPs by small backups, 2013, ICML 2013.
[7] Y. Niv, et al. Discovering latent causes in reinforcement learning, 2015, Current Opinion in Behavioral Sciences.
[8] Majid Nili Ahmadabadi, et al. Context Transfer and Q-Transferable Tasks, 2015, AAAI Workshop: Knowledge, Skill, and Behavior Transfer in Autonomous Robots.
[9] R. Bellman, et al. Dynamic Programming and Markov Processes, 1960.
[10] M. N. Ahmadabadi, et al. Reward Maximization Justifies the Transition from Sensory Selection at Childhood to Sensory Integration at Adulthood, 2014, PloS one.
[11] Majid Nili Ahmadabadi, et al. Expertness based cooperative Q-learning, 2002, IEEE Trans. Syst. Man Cybern. Part B.
[12] John N. Tsitsiklis, et al. Introduction to Probability, 2002.
[13] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[14] Richard S. Sutton, et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, 1990, ML.
[15] Manuela M. Veloso, et al. Multiagent Systems: A Survey from a Machine Learning Perspective, 2000, Auton. Robots.
[16] Honglak Lee, et al. Action-Conditional Video Prediction using Deep Networks in Atari Games, 2015, NIPS.
[17] Michael L. Littman, et al. An analysis of model-based Interval Estimation for Markov Decision Processes, 2008, J. Comput. Syst. Sci.
[18] R. Bellman. A Markovian Decision Process, 1957.
[19] Dana H. Ballard, et al. Learning to perceive and act by trial and error, 1991, Machine Learning.
[20] Chris Watkins, et al. Learning from delayed rewards, 1989.
[21] Robert C. Wilson, et al. Reinforcement Learning in Multidimensional Environments Relies on Attention Mechanisms, 2015, The Journal of Neuroscience.
[22] E. Ordentlich, et al. Inequalities for the L1 Deviation of the Empirical Distribution, 2003.
[23] Peter Auer, et al. Logarithmic Online Regret Bounds for Undiscounted Reinforcement Learning, 2006, NIPS.
[24] Sridhar Mahadevan, et al. Recent Advances in Hierarchical Reinforcement Learning, 2003, Discret. Event Dyn. Syst.
[25] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[26] W. Hoeffding. Probability Inequalities for Sums of Bounded Random Variables, 1963.
[27] Majid Nili Ahmadabadi, et al. Cost-sensitive learning of top-down modulation for attentional control, 2009, Machine Vision and Applications.
[28] Thomas G. Dietterich, et al. Automatic discovery and transfer of MAXQ hierarchies, 2008, ICML '08.
[29] Majid Nili Ahmadabadi, et al. Context Transfer in Reinforcement Learning Using Action-Value Functions, 2014, Comput. Intell. Neurosci.
[30] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[31] Robert Givan, et al. Bounded Parameter Markov Decision Processes, 1997, ECP.