Learning control of finite Markov chains with an explicit trade-off between estimation and control
[1] J. Spruce Riordon, et al. An adaptive automaton controller for discrete-time Markov processes, 1969, Autom.
[2] J. S. Riordon. An adaptive automaton controller for discrete-time Markov processes, 1969.
[3] M. Rothschild. A two-armed bandit theory of market pricing, 1974.
[4] J. Alster, et al. A technique for dual adaptive control, 1974, Autom.
[5] P. Mandl, et al. Estimation and control in Markov chains, 1974, Advances in Applied Probability.
[6] Kumpati S. Narendra, et al. Learning Automata - A Survey, 1974, IEEE Trans. Syst. Man Cybern.
[7] I. Witten. The apparent conflict between estimation and control—a survey of the two-armed bandit problem, 1976.
[8] V. Borkar, et al. Adaptive control of Markov chains, I: Finite parameter set, 1979, 1979 18th IEEE Conference on Decision and Control including the Symposium on Adaptive Processes.
[9] B. Doshi, et al. Strong consistency of a modified maximum likelihood estimator for controlled Markov chains, 1980.
[10] Y. M. El-Fattah, et al. Recursive Algorithms for Adaptive Control of Finite Markov Chains, 1981, IEEE Trans. Syst. Man Cybern.
[11] P. Kumar, et al. Optimal adaptive controllers for unknown Markov chains, 1982.
[12] P. Kumar, et al. A new family of optimal adaptive controllers for Markov chains, 1982.
[13] Mitsuo Sato, et al. Learning control of finite Markov chains with unknown transition probabilities, 1982.
[14] Mitsuo Sato, et al. An asymptotically optimal learning controller for finite Markov chains with unknown transition probabilities, 1985.