Comparison of reinforcement algorithms on discrete functions: learnability, time complexity, and scaling

The authors compare the performance of several algorithms in a reinforcement learning paradigm, including A_{R-P}, A_{R-I}, reinforcement comparison (plus a new variation), and backpropagation of the reinforcement gradient through a forward model. The task domain is discrete multi-output functions. Performance is measured in terms of learnability, training time, and scaling. A_{R-P} outperforms all the others and scales well relative to supervised backpropagation. An ergodic variant of reinforcement comparison approaches A_{R-P} performance. For the tasks studied, total training time (including model and controller) for the forward-model algorithm is one to two orders of magnitude greater than for A_{R-P}, and the controller's success is sensitive to forward-model accuracy. The controller's failures are traced to distortions of the reinforcement gradient predicted by an inaccurate forward model.
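To make the comparison concrete, here is a minimal sketch of the associative reward-penalty (A_{R-P}) rule for a single stochastic binary unit, in the style of Barto and Anandan. The class name, parameter names (rho, lam), and the immediate-reinforcement setting are assumptions for illustration, not details from the paper; setting lam = 0 recovers the A_{R-I} (reward-inaction) variant.

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

class ARPUnit:
    """A single stochastic A_{R-P} unit (illustrative sketch).

    The unit samples a binary action y in {0, 1} with probability
    p = sigmoid(w . x). On reward the weights move toward the action
    taken; on penalty they move weakly (scaled by lam) toward the
    opposite action. With lam = 0 this becomes A_{R-I}.
    """

    def __init__(self, n_inputs, rho=0.1, lam=0.05, seed=0):
        self.w = np.zeros(n_inputs)
        self.rho = rho            # learning rate
        self.lam = lam            # penalty scaling, 0 < lam << 1 for A_{R-P}
        self.rng = np.random.default_rng(seed)

    def act(self, x):
        p = sigmoid(self.w @ x)
        y = int(self.rng.random() < p)
        return y, p

    def update(self, x, y, p, r):
        """Apply the A_{R-P} weight update; r = 1 is reward, r = 0 is penalty."""
        if r == 1:
            delta = self.rho * (y - p)                    # reinforce the action taken
        else:
            delta = self.rho * self.lam * ((1 - y) - p)   # weakly favor the other action
        self.w += delta * x
```

For a discrete-function task of the kind the abstract describes, each trial would present an input vector x, sample an action with act, and call update with r = 1 if the action matched the target output and r = 0 otherwise.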