[1] Daniel A. Braun, et al. Generalized Thompson sampling for sequential decision-making and causal inference, 2013, Complex Adapt. Syst. Model.
[2] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.
[3] Majid Nili Ahmadabadi, et al. Online learning of task-driven object-based visual attention control, 2010, Image Vis. Comput.
[4] Daniel A. Braun, et al. Thermodynamics as a theory of decision-making with information-processing costs, 2012, Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences.
[5] Ben J. A. Kröse, et al. Learning from delayed rewards, 1995, Robotics Auton. Syst.
[6] Richard S. Sutton, et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding, 1995, NIPS.
[7] Shie Mannor, et al. Thompson Sampling for Complex Online Problems, 2013, ICML.
[8] Majid Nili Ahmadabadi, et al. Interactive Learning in Continuous Multimodal Space: A Bayesian Approach to Action-Based Soft Partitioning and Learning, 2012, IEEE Transactions on Autonomous Mental Development.
[9] H. Callen. Thermodynamics and an Introduction to Thermostatistics, 1988.
[10] Wei Wang, et al. Recommender system application developments: A survey, 2015, Decis. Support Syst.
[11] Felipe Leno da Silva, et al. Object-Oriented Curriculum Generation for Reinforcement Learning, 2018, AAMAS.
[12] Majid Nili Ahmadabadi, et al. Learning sequential visual attention control through dynamic state space discretization, 2009, 2009 IEEE International Conference on Robotics and Automation.
[13] Peter Stone, et al. Learning Curriculum Policies for Reinforcement Learning, 2018, AAMAS.
[14] Zoran Popovic, et al. Efficient Bayesian Clustering for Reinforcement Learning, 2016, IJCAI.
[15] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[16] Alex Graves, et al. Strategic Attentive Writer for Learning Macro-Actions, 2016, NIPS.
[17] Logan T. Trujillo. Mental Effort and Information-Processing Costs Are Inversely Related to Global Brain Free Energy During Visual Categorization, 2019, Front. Neurosci.
[18] M. N. Ahmadabadi, et al. Reward Maximization Justifies the Transition from Sensory Selection at Childhood to Sensory Integration at Adulthood, 2014, PLoS ONE.
[19] Stefan Wermter, et al. Real-world reinforcement learning for autonomous humanoid robot docking, 2012, Robotics Auton. Syst.
[20] W. Hoeffding. Probability Inequalities for Sums of Bounded Random Variables, 1963.
[21] Ian D. Watson, et al. Applying reinforcement learning to small scale combat in the real-time strategy game StarCraft: Brood War, 2012, 2012 IEEE Conference on Computational Intelligence and Games (CIG).
[22] Jürgen Schmidhuber, et al. PowerPlay: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem, 2011, Front. Psychol.
[23] Thierson Couto, et al. An evolutionary approach for combining results of recommender systems techniques based on Collaborative Filtering, 2014, 2014 IEEE Congress on Evolutionary Computation (CEC).
[24] Pieter Abbeel, et al. Learning vehicular dynamics, with application to modeling helicopters, 2005, NIPS.
[25] Andreas Holzinger, et al. Interactive machine learning for health informatics: when do we need the human-in-the-loop?, 2016, Brain Informatics.
[26] Gerald Tesauro, et al. Temporal Difference Learning and TD-Gammon, 1995, J. Int. Comput. Games Assoc.
[27] Rachel W. Jackson, et al. Human-in-the-loop optimization of exoskeleton assistance during walking, 2017, Science.
[28] Martha White, et al. Interval Estimation for Reinforcement-Learning Algorithms in Continuous-State Domains, 2010, NIPS.
[29] Karl J. Friston, et al. Bayesian model selection for group studies, 2009, NeuroImage.
[30] Vytautas Perlibakas, et al. Distance measures for PCA-based face recognition, 2004, Pattern Recognit. Lett.
[31] Michael L. Littman, et al. State Abstractions for Lifelong Reinforcement Learning, 2018, ICML.
[32] Jivko Sinapov, et al. Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey, 2020, J. Mach. Learn. Res.
[33] R. Bellman. A Markovian Decision Process, 1957.
[34] Michael L. Littman, et al. Near Optimal Behavior via Approximate State Abstraction, 2016, ICML.
[35] Daniel A. Braun, et al. Bounded Rational Decision-Making from Elementary Computations That Reduce Uncertainty, 2019, Entropy.
[36] Majid Nili Ahmadabadi, et al. Exploiting Generalization in the Subspaces for Faster Model-Based Reinforcement Learning, 2019, IEEE Transactions on Neural Networks and Learning Systems.
[37] Peter Auer, et al. Near-optimal Regret Bounds for Reinforcement Learning, 2008, J. Mach. Learn. Res.
[38] Daniel A. Braun, et al. Information, Utility and Bounded Rationality, 2011, AGI.
[39] I-Ming Chen, et al. Autonomous navigation of UAV by using real-time model-based reinforcement learning, 2016, 2016 14th International Conference on Control, Automation, Robotics and Vision (ICARCV).
[40] Jason Weston, et al. Dialogue Learning With Human-In-The-Loop, 2016, ICLR.
[41] E. Ordentlich, et al. Inequalities for the L1 Deviation of the Empirical Distribution, 2003.
[42] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[43] Daniel A. Braun, et al. Hierarchical Expert Networks for Meta-Learning, 2019, ArXiv.
[44] Daniel Polani. Information Theory of Decisions and Actions, 2011.
[45] Jordi Grau-Moya, et al. Planning with Information-Processing Constraints and Model Uncertainty in Markov Decision Processes, 2016, ECML/PKDD.
[46] Zoubin Ghahramani, et al. Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning, 2015, ICML.
[47] W. R. Thompson. On the Likelihood That One Unknown Probability Exceeds Another in View of the Evidence of Two Samples, 1933.