Controlled Markov set-chains under average criteria

In this paper, using interval arithmetic analysis, we consider the average case of controlled Markov set-chains, in which the transition matrix is allowed to fluctuate at each time step. We introduce a v-step contractive property for the average case, under which a Pareto optimal periodic policy is characterized as a maximal solution of the optimality equation. In addition, within the class of stationary policies, we investigate the behavior of the expected reward over a T-horizon as T approaches infinity and give the left- and right-hand side optimality equations, by which a Pareto optimal stationary policy is found. As a numerical example, the Taxicab problem is considered.
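To illustrate what fluctuating transition matrices mean for the expected reward over a T-horizon, the following is a minimal sketch, not the method of the paper. It assumes a fixed policy, a scalar reward vector `r`, and transition matrices whose rows may vary independently at every step between elementwise bounds `P_lo` and `P_hi`; the function names and the toy numbers are hypothetical. Under these assumptions, a backward recursion that picks the worst and best admissible row at each step gives lower and upper bounds on the T-horizon expected reward, and dividing by T gives bounds on the average reward per step.

```python
def extreme_expectation(lo, hi, v, maximize):
    """Optimize sum(q[j] * v[j]) over rows q with lo <= q <= hi and sum(q) == 1.

    Assumes sum(lo) <= 1 <= sum(hi), so the set of admissible rows is non-empty.
    """
    q = list(lo)
    slack = 1.0 - sum(lo)  # probability mass still to be distributed
    # Give the free mass to the largest values first when maximizing,
    # to the smallest values first when minimizing.
    order = sorted(range(len(v)), key=lambda j: v[j], reverse=maximize)
    for j in order:
        add = min(hi[j] - lo[j], slack)
        q[j] += add
        slack -= add
    return sum(qj * vj for qj, vj in zip(q, v))


def horizon_bounds(P_lo, P_hi, r, T):
    """Lower and upper bounds on the T-horizon expected reward per start state."""
    n = len(r)
    v_lo = [0.0] * n
    v_hi = [0.0] * n
    for _ in range(T):
        v_lo = [r[i] + extreme_expectation(P_lo[i], P_hi[i], v_lo, False)
                for i in range(n)]
        v_hi = [r[i] + extreme_expectation(P_lo[i], P_hi[i], v_hi, True)
                for i in range(n)]
    return v_lo, v_hi


if __name__ == "__main__":
    # Hypothetical two-state example: each entry may vary between 0.4 and 0.6.
    P_lo = [[0.4, 0.4], [0.4, 0.4]]
    P_hi = [[0.6, 0.6], [0.6, 0.6]]
    r = [1.0, 0.0]
    T = 50
    lo, hi = horizon_bounds(P_lo, P_hi, r, T)
    print([x / T for x in lo], [x / T for x in hi])
```

When `P_lo` equals `P_hi` the two bounds coincide and the recursion reduces to the usual finite-horizon evaluation of an ordinary Markov chain. The sketch handles only a single policy and scalar rewards; it does not address the Pareto optimality or the optimality equations treated in the paper.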
