Associative reinforcement learning of real-valued functions

The author describes an algorithm, called the stochastic real-valued (SRV) algorithm, that uses evaluative performance feedback to learn associative maps from input vectors to real-valued actions. The algorithm builds on the pioneering work of A.G. Barto and P. Anandan (1985) in synthesizing associative reinforcement learning (ARL) algorithms using techniques from pattern classification and automata theory. A strong convergence theorem is presented that implies a form of optimal performance of the SRV algorithm on ARL tasks under certain general conditions. Simulation results illustrate the convergence behavior of the algorithm under the conditions of the theorem. Further simulations, in which some of the conditions of the theorem are violated, demonstrate the robustness of the algorithm.
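To make the idea of learning real-valued actions from evaluative feedback concrete, here is a minimal Python sketch of a single stochastic unit. It is not taken from the abstract, which gives no equations. The following are all assumptions made for illustration: the linear mean and linear reinforcement predictor, the rule that the exploration width shrinks as predicted reinforcement rises (`sigma = max(sigma_min, 1 - r_hat)`), the update rules, the parameter values, the toy task, and every name (`SRVUnit`, `train_demo`, `alpha`, `rho`, `sigma_min`).

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


class SRVUnit:
    """Illustrative stochastic real-valued unit (assumed form, not the paper's exact rules)."""

    def __init__(self, n_inputs, alpha=0.1, rho=0.1, sigma_min=0.01, rng=None):
        self.w = [0.0] * n_inputs  # weights for the mean of the action distribution
        self.v = [0.0] * n_inputs  # weights for the predicted reinforcement
        self.alpha = alpha          # step size for the mean weights (assumed value)
        self.rho = rho              # step size for the predictor weights (assumed value)
        self.sigma_min = sigma_min  # floor on the exploration width (assumed value)
        self.rng = rng or random.Random()

    def act(self, x):
        """Sample a real-valued action from a Gaussian centred on the current mean."""
        mu = dot(self.w, x)
        r_hat = dot(self.v, x)
        # Assumption: explore less when the predicted reinforcement is high.
        sigma = max(self.sigma_min, 1.0 - r_hat)
        a = self.rng.gauss(mu, sigma)
        return a, mu, sigma, r_hat

    def learn(self, x, a, mu, sigma, r_hat, r):
        """Shift the mean toward actions that did better than predicted."""
        delta = r - r_hat        # how much better or worse than expected
        z = (a - mu) / sigma     # normalized perturbation that was tried
        for i, xi in enumerate(x):
            self.w[i] += self.alpha * delta * z * xi
            self.v[i] += self.rho * delta * xi


def train_demo(steps=6000, seed=0):
    """Toy associative task: two one-hot inputs, each with its own best action."""
    rng = random.Random(seed)
    contexts = [([1.0, 0.0], 0.3), ([0.0, 1.0], -0.5)]  # (input, hypothetical target)
    unit = SRVUnit(2, rng=rng)
    for _ in range(steps):
        x, target = rng.choice(contexts)
        a, mu, sigma, r_hat = unit.act(x)
        # Evaluative feedback only: a scalar in [0, 1], never the target itself.
        r = max(0.0, 1.0 - abs(a - target))
        unit.learn(x, a, mu, sigma, r_hat, r)
    return unit


if __name__ == "__main__":
    trained = train_demo()
    print("learned means:", trained.w)
    print("predicted reinforcement:", trained.v)
```

In this sketch the learner never sees the target action, only a scalar score for the action it tried. Tying the exploration width to the predicted reinforcement is one plausible way to get wide search early and narrow search once performance is good; other schedules would also fit the abstract's description.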

[1] J. Wolfowitz. On the Stochastic Approximation Method of Robbins and Monro, 1952.

[2] J. Kiefer, et al. Stochastic Estimation of the Maximum of a Regression Function, 1952.

[3] W. A. Clark, et al. Simulation of self-organizing systems by digital computer, 1954, Trans. IRE Prof. Group Inf. Theory.

[4] J. Laurie Snell, et al. Studies in mathematical learning theory, 1960.

[5] George E. Ferris, et al. An Introduction to Mathematical Learning Theory, 1966.

[6] B. Chandrasekaran, et al. On Expediency and Convergence in Variable-Structure Automata, 1968, IEEE Trans. Syst. Sci. Cybern.

[7] F. Downton. Stochastic Approximation, 1969, Nature.

[8] M. L. Tsetlin, et al. Automaton theory and modeling of biological systems, 1973.

[9] E. Harth, et al. Alopex: a stochastic method for determining visual receptive fields, 1974, Vision research.

[10] Peter E. Hart, et al. Pattern classification and scene analysis, 1974, A Wiley-Interscience publication.

[11] Sheldon M. Ross, et al. Stochastic Processes, 2018, Gauge Integral Structures for Stochastic Calculus and Quantum Electrodynamics.

[12] P. Anandan, et al. Pattern-recognizing stochastic learning automata, 1985, IEEE Transactions on Systems, Man, and Cybernetics.

[13] Geoffrey E. Hinton, et al. A Learning Algorithm for Boltzmann Machines, 1985, Cogn. Sci.

[14] Robert B. Allen, et al. Stochastic Learning Networks and their Electronic Implementation, 1987, NIPS.

[15] V. Gullapalli. A Stochastic Algorithm for Learning Real-valued Functions via Reinforcement, 1988.

[16] Kumpati S. Narendra, et al. Learning automata - an introduction, 1989.

[17] Vijaykumar Gullapalli, et al. A stochastic reinforcement learning algorithm for learning real-valued functions, 1990, Neural Networks.

[18] V. Gullapalli. Modeling cortical area 7a using Stochastic Real-Valued (SRV) units, 1991.