A Geometric Approach to Multi-Criterion Reinforcement Learning
[1] Andrew G. Barto,et al. Reinforcement learning , 1998 .
[2] H. Moskowitz,et al. Generalized dynamic programming for multicriteria optimization , 1990 .
[3] Peter Auer,et al. The Nonstochastic Multiarmed Bandit Problem , 2002, SIAM J. Comput..
[4] Peter L. Bartlett,et al. Neural Network Learning - Theoretical Foundations , 1999 .
[5] Peter L. Bartlett,et al. Learning in Neural Networks: Theoretical Foundations , 1999 .
[6] Toru Maruyama. On Some Recent Developments in Convex Analysis , 1977 .
[7] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[8] Sridhar Mahadevan,et al. Average reward reinforcement learning: Foundations, algorithms, and empirical results , 2004, Machine Learning.
[9] Csaba Szepesvári,et al. A Unified Analysis of Value-Function-Based Reinforcement-Learning Algorithms , 1999, Neural Computation.
[10] Y. Freund,et al. The non-stochastic multi-armed bandit problem , 2001 .
[11] Vijay R. Konda,et al. On Actor-Critic Algorithms , 2003, SIAM J. Control. Optim..
[12] Cyrus Derman,et al. Finite State Markovian Decision Processes , 1970 .
[13] R. S. Laundy,et al. Multiple Criteria Optimisation: Theory, Computation and Application , 1989 .
[14] A. Tversky,et al. Prospect Theory: An Analysis of Decision under Risk , 2007 .
[15] Michael Kearns,et al. Near-Optimal Reinforcement Learning in Polynomial Time , 1998, Machine Learning.
[16] Timothy X. Brown,et al. Switch Packet Arbitration via Queue-Learning , 2001, NIPS.
[17] R. Karp,et al. On Nonterminating Stochastic Games , 1966 .
[18] S. Deming. Multiple-criteria optimization , 1991 .
[19] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[20] Ralph E. Steuer. Multiple criteria optimization , 1986 .
[21] D. White. Multi-objective infinite-horizon discounted Markov decision processes , 1982 .
[22] Konkoly Thege. Multi-criteria Reinforcement Learning , 1998 .
[23] Nahum Shimkin,et al. Stochastic Games with Average Cost Constraints , 1994 .
[24] A. Shwartz,et al. Guaranteed performance regions in Markovian systems with competing decision makers , 1993, IEEE Trans. Autom. Control..
[25] J. Filar,et al. Competitive Markov Decision Processes , 1996 .
[26] D. Blackwell. An analog of the minimax theorem for vector payoffs , 1956 .
[27] Ronen I. Brafman,et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning , 2001, J. Mach. Learn. Res..
[28] John N. Tsitsiklis,et al. Actor-Critic Algorithms , 1999, NIPS.
[29] Alexander S. Poznyak,et al. Self-Learning Control of Finite Markov Chains , 2000 .
[30] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[31] Vivek S. Borkar,et al. Stochastic Approximation for Nonexpansive Maps: Application to Q-Learning Algorithms , 1997, SIAM J. Control. Optim..
[32] Michael L. Littman,et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning , 1994, ICML.
[33] M. I. Henig. Vector-Valued Dynamic Programming , 1983 .
[34] Shie Mannor,et al. Reinforcement Learning for Average Reward Zero-Sum Games , 2004, COLT.
[35] Andrew Tridgell,et al. TDLeaf(lambda): Combining Temporal Difference Learning with Game-Tree Search , 1999, ArXiv.
[36] Tommi S. Jaakkola,et al. Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms , 2000, Machine Learning.
[37] Jonathan Baxter,et al. TDLeaf(lambda): Combining Temporal Difference Learning with Game-Tree Search , 1998 .
[38] E. Altman. Constrained Markov Decision Processes , 1999 .
[39] Gerald Tesauro,et al. TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play , 1994, Neural Computation.
[40] Anton Schwartz,et al. A Reinforcement Learning Method for Maximizing Undiscounted Rewards , 1993, ICML.
[41] Michael Kearns,et al. Finite-Sample Convergence Rates for Q-Learning and Indirect Algorithms , 1998, NIPS.
[42] Shie Mannor,et al. The Empirical Bayes Envelope and Regret Minimization in Competitive Markov Decision Processes , 2003, Math. Oper. Res..
[43] Herbert A. Simon,et al. The Sciences of the Artificial , 1970 .
[44] L. Shapley,et al. Stochastic Games , 1953, Proceedings of the National Academy of Sciences.
[45] Eitan Altman,et al. Constrained Markov Games: Nash Equilibria , 2000 .