Consistency of HDP applied to a simple reinforcement learning problem