Generalization of a Result of Fabian on the Asymptotic Normality of Stochastic Approximation

Stochastic approximation (SA) is a general framework for analyzing the convergence of a large collection of stochastic root-finding algorithms. The Kiefer–Wolfowitz and stochastic gradient algorithms are two well-known (and widely used) examples of SA. Because of their applicability to a wide range of problems, many results have been obtained regarding the convergence properties of SA procedures. One important reference in the literature, Fabian (1968), derives general conditions for the asymptotic normality of the SA iterates. Since then, many results regarding asymptotic normality of SA procedures have relied heavily on Fabian’s theorem. Unfortunately, some of the assumptions of Fabian’s result are not applicable to some modern implementations of SA in control and learning. In this paper we explain the nature of this incompatibility and show how Fabian’s theorem can be generalized to address the issue.

[1]  J. Blum Multidimensional Stochastic Approximation Methods , 1954 .

[2]  Jiaqiao Hu,et al.  Gradient-Based Adaptive Stochastic Search for Non-Differentiable Optimization , 2013, IEEE Transactions on Automatic Control.

[3]  H. Vincent Poor,et al.  Distributed Linear Parameter Estimation: Asymptotically Efficient Adaptive Strategies , 2011, SIAM J. Control. Optim.

[4]  V. Fabian On Asymptotic Normality in Stochastic Approximation , 1968 .

[5]  Yurii Nesterov,et al.  Efficiency of Coordinate Descent Methods on Huge-Scale Optimization Problems , 2012, SIAM J. Optim.

[6]  Harold J. Kushner,et al.  Stochastic Approximation Algorithms and Applications , 1997, Applications of Mathematics.

[7]  H. Robbins A Stochastic Approximation Method , 1951 .

[8]  V. Fabian Stochastic Approximation of Minima with Improved Asymptotic Speed , 1967 .

[9]  James C. Spall,et al.  Adaptive stochastic approximation by the simultaneous perturbation method , 2000, IEEE Trans. Autom. Control.

[10]  J. Spall Multivariate stochastic approximation using a simultaneous perturbation gradient approximation , 1992 .

[11]  Ping Hu,et al.  A Stochastic Approximation Framework for a Class of Randomized Optimization Algorithms , 2012, IEEE Transactions on Automatic Control.

[12]  John N. Tsitsiklis,et al.  Asynchronous Stochastic Approximation and Q-Learning , 1994, Machine Learning.

[13]  Stephen S. Wilson,et al.  Random iterative models , 1996 .

[14]  Pierre Priouret,et al.  Adaptive Algorithms and Stochastic Approximations , 1990, Applications of Mathematics.

[15]  Harold J. Kushner,et al.  Stochastic approximation methods for constrained and unconstrained systems , 1978 .

[16]  L. A. Prashanth,et al.  Stochastic Recursive Algorithms for Optimization: Simultaneous Perturbation Methods , 2012 .

[17]  Charles R. Johnson,et al.  Matrix analysis , 1985 .

[18]  James C. Spall,et al.  Introduction to Stochastic Search and Optimization: Estimation, Simulation, and Control , 2007 .

[19]  Mikhail Borisovich Nevelʹson,et al.  Stochastic Approximation and Recursive Estimation , 1976 .

[20]  M Noble,et al.  Estimation problems associated with stochastic modeling of proliferation and differentiation of O-2A progenitor cells in vitro. , 2000, Mathematical biosciences.