Algorithmic aspects of mean-variance optimization in Markov decision processes
[1] J. Michael Steele, et al. Markov Decision Problems Where Means Bound Variances, 2014, Oper. Res.
[2] Xianping Guo, et al. A mean-variance optimization problem for discounted Markov decision processes, 2012, Eur. J. Oper. Res.
[3] Shie Mannor, et al. Policy Gradients with Variance Related Risk Criteria, 2012, ICML.
[4] Javier de Frutos, et al. Approximation of Dynamic Programs, 2012.
[5] Beatrice Gralton, et al. Washington DC - USA, 2008.
[6] Sean P. Meyn. Control Techniques for Complex Networks: Workload, 2007.
[7] J. Tsitsiklis, et al. Robust, risk-sensitive, and data-driven control of Markov decision processes, 2007.
[8] Sven Koenig, et al. Functional Value Iteration for Decision-Theoretic Planning with General Utility Functions, 2006, AAAI.
[9] Laurent El Ghaoui, et al. Robust Control of Markov Decision Processes with Uncertain Transition Matrices, 2005, Oper. Res.
[10] Sven Koenig, et al. Risk-Sensitive Planning with One-Switch Utility Functions: Value Iteration, 2005, AAAI.
[11] Garud Iyengar, et al. Robust Dynamic Programming, 2005, Math. Oper. Res.
[12] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[13] L. Ghaoui, et al. Robust Markov decision processes with uncertain transition matrices, 2004.
[14] Frank Riedel, et al. Dynamic Coherent Risk Measures, 2003.
[15] Mihalis Yannakakis, et al. On the approximability of trade-offs and optimal access of Web sources, 2000, Proceedings 41st Annual Symposium on Foundations of Computer Science.
[16] Philippe Artzner, et al. Coherent Measures of Risk, 1999.
[17] E. Altman. Constrained Markov Decision Processes, 1999.
[18] E. J. Collins. Finite-horizon variance penalised Markov decision processes, 1997.
[19] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[20] Ying Huang, et al. On Finding Optimal Policies for Markov Decision Chains: A Unifying Framework for Mean-Variance-Tradeoffs, 1994, Math. Oper. Res.
[21] D. J. White. Computational approaches to variance-penalised Markov decision processes, 1992.
[22] Jerzy A. Filar, et al. Variance-Penalized Markov Decision Processes, 1989, Math. Oper. Res.
[23] P. Whittle. Restless bandits: activity allocation in a changing world, 1988, Journal of Applied Probability.
[24] H. Kawai. A variance minimization problem for a Markov decision process, 1987.
[25] M. J. Sobel, et al. Discounted MDP's: distribution functions and exponential utility maximization, 1987.
[26] M. J. Sobel. The variance of discounted Markov decision processes, 1982, Journal of Applied Probability.
[27] Ward Whitt, et al. Approximations of Dynamic Programs, II, 1979, Math. Oper. Res.
[28] J. Gittins. Bandit processes and dynamic allocation indices, 1979.
[29] Ward Whitt, et al. Approximations of Dynamic Programs, I, 1978, Math. Oper. Res.
[30] David S. Johnson, et al. Computers and Intractability: A Guide to the Theory of NP-Completeness, 1978.
[31] J. Cockcroft. Investment in Science, 1962, Nature.
[32] L. Shapley, et al. Stochastic Games, 1953, Proceedings of the National Academy of Sciences.