The Steering Approach for Multi-Criteria Reinforcement Learning
[1] A. Shwartz, et al. Guaranteed performance regions in Markovian systems with competing decision makers, 1993, IEEE Trans. Autom. Control.
[2] Cyrus Derman, et al. Finite State Markovian Decision Processes, 1970.
[3] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.
[4] Richard S. Sutton, et al. Reinforcement Learning, 1992, Handbook of Machine Learning.
[5] J. Filar, et al. Competitive Markov Decision Processes, 1996.
[6] Michael L. Littman, et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning, 1994, ICML.
[7] Ronen I. Brafman, et al. A near-optimal polynomial time algorithm for learning in certain classes of stochastic games, 2000, Artif. Intell.
[8] Anton Schwartz, et al. A Reinforcement Learning Method for Maximizing Undiscounted Rewards, 1993, ICML.
[9] A. Neyman, et al. Stochastic games, 1981.
[10] Vivek S. Borkar, et al. Learning Algorithms for Markov Decision Processes with Average Cost, 2001, SIAM J. Control. Optim.
[11] Sridhar Mahadevan, et al. Average Reward Reinforcement Learning: Foundations, Algorithms, and Empirical Results, 2005, Machine Learning.
[12] D. Blackwell. An analog of the minimax theorem for vector payoffs, 1956.