Paper information - Reinforcement learning algorithms as function optimizers

Any nonassociative reinforcement learning algorithm can be viewed as a method for performing function optimization through (possibly noise-corrupted) sampling of function values. A description is given of the results of simulations in which the optima of several deterministic functions studied by D.H. Ackley (Ph.D. Diss., Carnegie-Mellon Univ., 1987) were sought using variants of REINFORCE algorithms. Results obtained for certain of these algorithms compare favorably to the best results found by Ackley.
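To make the idea in the abstract concrete, the following is a minimal sketch of how a REINFORCE-style update can act as a function optimizer over bit vectors. It is an illustration under stated assumptions, not the specific variants evaluated in the paper: the Bernoulli-logistic units, the running-average baseline, the hyperparameter values, and the names `one_max` and `reinforce_optimize` are all hypothetical choices made here. The function value of each sampled point is used directly as the reinforcement signal.

```python
import math
import random


def one_max(bits):
    """Hypothetical test function: number of 1 bits (maximum is len(bits))."""
    return sum(bits)


def reinforce_optimize(f, n_bits, steps=2000, lr=0.1, decay=0.9, seed=0):
    """Search for a maximizer of f over {0,1}^n_bits by sampling function values.

    Assumed setup: one Bernoulli-logistic unit per bit, each with a single
    bias weight w_i and firing probability p_i = sigmoid(w_i).
    """
    rng = random.Random(seed)
    w = [0.0] * n_bits      # start with p_i = 0.5 for every bit
    baseline = 0.0          # running average of past reinforcement
    best_x, best_r = None, -math.inf
    for _ in range(steps):
        p = [1.0 / (1.0 + math.exp(-wi)) for wi in w]
        # Sample a candidate point from the current distribution.
        x = [1 if rng.random() < pi else 0 for pi in p]
        # The sampled function value serves as the reinforcement.
        r = f(x)
        if r > best_r:
            best_x, best_r = x, r
        # REINFORCE update: lr * (r - baseline) * d ln Pr(x_i) / d w_i,
        # where the derivative equals (x_i - p_i) for a Bernoulli-logistic unit.
        for i in range(n_bits):
            w[i] += lr * (r - baseline) * (x[i] - p[i])
        # Update the baseline after using it, so it reflects only past samples.
        baseline = decay * baseline + (1.0 - decay) * r
    return best_x, best_r


if __name__ == "__main__":
    x, r = reinforce_optimize(one_max, n_bits=20)
    print("best value found:", r)
```

The sketch keeps the best sampled point separately from the weights because the search distribution is only a means of proposing candidates. Any function that maps a bit list to a number can be passed in place of `one_max`.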

J. Peng | Ronald J. Williams

[1] Kumpati S. Narendra, et al. Learning Automata - A Survey, 1974, IEEE Trans. Syst. Man Cybern.

[2] John H. Holland, et al. Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence, 1992.

[3] David S. Johnson, et al. Computers and Intractability: A Guide to the Theory of NP-Completeness, 1978.

[4] S. Lakshmivarahan, et al. Learning Algorithms Theory and Applications, 1981.

[5] Richard S. Sutton, et al. Neuronlike adaptive elements that can solve difficult learning control problems, 1983, IEEE Transactions on Systems, Man, and Cybernetics.

[6] C. D. Gelatt, et al. Optimization by Simulated Annealing, 1983, Science.

[7] Richard S. Sutton, et al. Temporal credit assignment in reinforcement learning, 1984.

[8] P. Anandan, et al. Pattern-recognizing stochastic learning automata, 1985, IEEE Transactions on Systems, Man, and Cybernetics.

[9] A. G. Barto, et al. Learning by statistical cooperation of self-interested neuron-like computing elements, 1985, Human Neurobiology.

[10] Geoffrey E. Hinton, et al. A Learning Algorithm for Boltzmann Machines, 1985, Cogn. Sci.

[11] D. Ackley. Stochastic iterated genetic hillclimbing, 1987.

[12] D. Ackley. A connectionist machine for genetic hillclimbing, 1987.