Controlled Markov set-chains under average criteria

In this paper, applying interval arithmetic analysis, we consider the average case of controlled Markov set-chains, whose transition matrices are allowed to fluctuate at each time step. We introduce a v-step contractive property for the average case, under which a Pareto optimal periodic policy is characterized as a maximal solution of the optimality equation. Also, within the class of stationary policies, we investigate the behavior of the expected reward over a T-horizon as T approaches ∞ and give the left- and right-hand side optimality equations, by which a Pareto optimal stationary policy is found. As a numerical example, the Taxicab problem is considered.
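
To make the idea of fluctuating transition matrices concrete, the following is a minimal sketch, not the paper's own algorithm. It assumes a fixed policy, a reward vector `r`, and elementwise interval bounds `P_lo <= P <= P_hi` on each row of the transition matrix, with the matrix free to change within those bounds at every step. Under these assumptions it computes lower and upper bounds on the T-horizon expected reward divided by T. All names (`extreme_expectation`, `horizon_bounds`) and the numbers in the usage note are hypothetical.

```python
import numpy as np

def extreme_expectation(lo, hi, v, maximize):
    """Optimize q @ v over probability vectors q with lo <= q <= hi.

    Greedy rule: start from the lower bounds, then hand the remaining
    probability mass to the most favourable states first.
    Assumes lo.sum() <= 1 <= hi.sum().
    """
    q = np.asarray(lo, dtype=float).copy()
    slack = 1.0 - q.sum()              # mass still to distribute
    order = np.argsort(v)              # states by ascending value
    if maximize:
        order = order[::-1]            # best states first when maximizing
    for j in order:
        add = min(hi[j] - q[j], slack)
        q[j] += add
        slack -= add
    return float(q @ v)

def horizon_bounds(P_lo, P_hi, r, T):
    """Lower and upper bounds on (T-horizon expected reward) / T per state."""
    P_lo, P_hi = np.asarray(P_lo, float), np.asarray(P_hi, float)
    r = np.asarray(r, float)
    n = len(r)
    v_lo = np.zeros(n)
    v_hi = np.zeros(n)
    for _ in range(T):
        # One backward step: reward now plus worst/best expected continuation.
        v_lo = np.array([r[i] + extreme_expectation(P_lo[i], P_hi[i], v_lo, False)
                         for i in range(n)])
        v_hi = np.array([r[i] + extreme_expectation(P_lo[i], P_hi[i], v_hi, True)
                         for i in range(n)])
    return v_lo / T, v_hi / T
```

For instance, calling `horizon_bounds([[0.4, 0.4], [0.4, 0.4]], [[0.6, 0.6], [0.6, 0.6]], [1.0, 3.0], T)` for increasing `T` shows how the gap between the two bounds behaves as the horizon grows. When `P_lo` equals `P_hi` the bounds coincide and the sketch reduces to an ordinary Markov chain.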

Masanori Hosaka | Masayuki Horiguchi | Masami Kurano

[1] D. White. Multi-objective infinite-horizon discounted Markov decision processes, 1982.

[2] D. J. Hartfiel. Component bounds on Markov set-chain limiting sets, 1991.

[3] Darald J. Hartfiel, et al. Markov Set-Chains, 1998.

[4] John G. Kemeny, et al. Finite Markov Chains, 1960.

[5] A. Neumaier. New techniques for the analysis of linear interval equations, 1984.

[6] Ronald A. Howard, et al. Dynamic Programming and Markov Processes, 1960.

[7] Nagata Furukawa, et al. Characterization of Optimal Policies in Vector-Valued Markovian Decision Processes, 1980, Math. Oper. Res.

[8] E. Seneta, et al. On the theory of Markov set-chains, 1994, Advances in Applied Probability.

[9] Masanori Hosaka, et al. Non-discounted optimal policies in controlled Markov set-chains, 1999.

[10] Yukio Takahashi. A Weak D-Markov Chain Approach to Tandem Queueing Networks, 1987, Computer Performance and Reliability.

[11] Valerie Isham, et al. Non-Negative Matrices and Markov Chains, 1983.

[12] J. Bather. Optimal decision procedures for finite Markov chains. Part II: Communicating systems, 1973, Advances in Applied Probability.

[13] O. Hernández-Lerma. Adaptive Markov Control Processes, 1989.

[14] Masanori Hosaka, et al. Controlled Markov set-chains with discounting, 1998.

[15] N. L. Lawrie, et al. Mathematical Computer Performance and Reliability, 1984.

[16] D. Blackwell. Discrete Dynamic Programming, 1962.

[17] J. Bather. Optimal decision procedures for finite Markov chains. Part I: Examples, 1973, Advances in Applied Probability.

[18] K. Hinderer, et al. Foundations of Non-stationary Dynamic Programming with Discrete Time Parameter, 1970.

[19] John G. Kemeny, et al. Finite Markov chains, 1960.

[20] Masami Kurano, et al. Markov Decision Processes with a Borel Measurable Cost Function - The Average Case, 1986, Math. Oper. Res.

[21] Masanori Hosaka, et al. A Span Seminorm Approach to Controlled Markov Set-Chains (III: Natural Sciences), 1998.

[22] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.