An Adaptive Optimal Controller for Discrete-Time Markov Environments
[1] Bernard Widrow, et al. Punish/Reward: Learning with a Critic in Adaptive Threshold Systems, 1973, IEEE Trans. Syst. Man Cybern.
[2] Ian H. Witten. Finite-Time Performance of Some Two-Armed Bandit Controllers, 1973, IEEE Trans. Syst. Man Cybern.
[3] Ian H. Witten, et al. Human operators and automatic adaptive controllers: A comparative study on a particular control task, 1973.
[4] Thomas M. Cover, et al. The two-armed-bandit problem with time-invariant finite memory, 1970, IEEE Trans. Inf. Theory.
[5] Kumpati S. Narendra, et al. Use of Stochastic Automata for Parameter Self-Optimization with Multimodal Performance Criteria, 1969, IEEE Trans. Syst. Sci. Cybern.
[6] Marvin Minsky, et al. Perceptrons: An Introduction to Computational Geometry, 1969.
[7] J. Nash. Non-Cooperative Games, 1951, Classics in Game Theory.