A new family of optimal adaptive controllers for Markov chains
We consider the problem of adaptively controlling a Markov chain with unknown transition probabilities. A new family of adaptive controllers is exhibited which achieves a performance exactly equal to the optimal performance achievable if the transition probabilities (i.e., the model or dynamics of the system) were known instead. Hence, the adaptive controllers presented here are truly optimal. The performance of the system to be controlled is measured by the average of the costs incurred over an infinite operating time period. These adaptive controllers can, potentially, be implemented on digital computers and used in the on-line control of unknown systems.
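To make the benchmark concrete, the following minimal sketch computes the long-run average cost of each stationary policy for a small controlled Markov chain whose transition probabilities are known, and picks the cheapest one. This is the known-model optimum that the abstract says the adaptive controllers match; it is not the adaptive controller itself. The two-state, two-action chain, the cost table, and all function names are illustrative assumptions, and the sketch assumes every stationary policy induces an ergodic chain so that the average cost does not depend on the initial state.

```python
from itertools import product

# Hypothetical 2-state, 2-action example; every number here is an assumption.
# P[u][x][y] = probability of moving from state x to state y under action u.
P = {
    0: [[0.9, 0.1], [0.5, 0.5]],
    1: [[0.2, 0.8], [0.1, 0.9]],
}
# COST[x][u] = one-step cost of taking action u in state x.
COST = [[1.0, 3.0], [4.0, 2.5]]


def stationary_distribution(T, iters=2000):
    """Approximate the stationary distribution of T by power iteration."""
    n = len(T)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[x] * T[x][y] for x in range(n)) for y in range(n)]
    return pi


def average_cost(policy):
    """Long-run average cost of a stationary policy (policy[x] = action in x)."""
    n = len(policy)
    # Row x of the closed-loop chain comes from the action chosen in state x.
    T = [P[policy[x]][x] for x in range(n)]
    pi = stationary_distribution(T)
    return sum(pi[x] * COST[x][policy[x]] for x in range(n))


def best_stationary_policy():
    """Enumerate all deterministic stationary policies; return the cheapest."""
    n_states = len(COST)
    actions = sorted(P)
    return min(product(actions, repeat=n_states), key=average_cost)


if __name__ == "__main__":
    best = best_stationary_policy()
    print(best, average_cost(best))
```

Brute-force enumeration is only workable for tiny examples like this one; it is used here to keep the benchmark transparent rather than to suggest a practical method.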