Consistency of HDP applied to a simple reinforcement learning problem

[1]  Andrew G. Barto, et al.  Connectionist learning for control: an overview, 1990.

[2]  Mitsuo Kawato, et al.  Computational schemes and neural network models for formulation and control of multijoint arm trajectory, 1990.

[3]  Ronald J. Williams, et al.  Adaptive state representation and estimation using recurrent connectionist networks, 1990.

[4]  Paul J. Werbos, et al.  Maximizing long-term gas industry profits in two minutes in Lotus using neural network methods, 1989, IEEE Trans. Syst. Man Cybern.

[5]  B. Widrow, et al.  The truck backer-upper: an example of self-learning in neural networks, 1989, International 1989 Joint Conference on Neural Networks.

[6]  Michael I. Jordan, et al.  Generic constraints on underspecified target trajectories, 1989, International 1989 Joint Conference on Neural Networks.

[7]  P. J. Werbos, et al.  Backpropagation and neurocontrol: a review and prospectus, 1989, International 1989 Joint Conference on Neural Networks.

[8]  P. J. Werbos, et al.  Backpropagation: past and future, 1988, IEEE 1988 International Conference on Neural Networks.

[9]  R. J. Williams, et al.  On the use of backpropagation in associative reinforcement learning, 1988, IEEE 1988 International Conference on Neural Networks.

[10]  Paul J. Werbos, et al.  Generalization of backpropagation with application to a recurrent gas market model, 1988, Neural Networks.

[11]  S. Grossberg, et al.  Neural dynamics of attentionally modulated Pavlovian conditioning: blocking, interstimulus interval, and secondary reinforcement, 1987, Applied Optics.

[12]  Lokendra Shastri, et al.  Learning Phonetic Features Using Connectionist Networks, 1987, IJCAI.

[13]  Richard S. Sutton, et al.  Temporal credit assignment in reinforcement learning, 1984.

[14]  Richard S. Sutton, et al.  Neuronlike adaptive elements that can solve difficult learning control problems, 1983, IEEE Transactions on Systems, Man, and Cybernetics.

[15]  John S. Edwards, et al.  The Hedonistic Neuron: A Theory of Memory, Learning and Intelligence, 1983.

[16]  Paul J. Werbos, et al.  Applications of advances in nonlinear sensitivity analysis, 1982.

[17]  R. Godson.  Elements of intelligence, 1979.

[18]  P. Werbos, et al.  Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences, 1974.

[19]  David Q. Mayne, et al.  Differential dynamic programming, 1972, The Mathematical Gazette.

[20]  Michael D. Geurts, et al.  Time Series Analysis: Forecasting and Control, 1977.

[21]  R. Howard.  Dynamic Programming and Markov Processes, 1960.