An on-line algorithm for dynamic reinforcement learning and planning in reactive environments
[1] Anthony J. Robinson, et al. Static and Dynamic Error Propagation Networks with Application to Speech Coding, 1987, NIPS.
[2] B. Widrow, et al. The truck backer-upper: an example of self-learning in neural networks, 1989, International 1989 Joint Conference on Neural Networks.
[3] Michael I. Jordan. Supervised learning and systems with excess degrees of freedom, 1988.
[4] Jürgen Schmidhuber, et al. Recurrent networks adjusted by adaptive critics, 1990.
[5] Paul J. Werbos, et al. Building and Understanding Adaptive Systems: A Statistical/Numerical Approach to Factory Automation and Brain Research, 1987, IEEE Transactions on Systems, Man, and Cybernetics.
[6] Frank Fallside, et al. Dynamic reinforcement driven error propagation networks with application to game playing, 1989.
[7] Richard S. Sutton, et al. Neuronlike adaptive elements that can solve difficult learning control problems, 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[8] Jürgen Schmidhuber, et al. The neural bucket brigade, 1989.