Improved SPSA Using Complex Variables with Applications in Optimal Control Problems

Consider minimizing a general objective function when only noisy function measurements are available. Such a problem has a broad range of practical applications, including optimal control, operations research, and machine learning. Building on the simultaneous perturbation stochastic approximation (SPSA) algorithm and the complex-step (CS) gradient approximation, this work proposes a new algorithm called complex-step simultaneous perturbation stochastic approximation (CS-SPSA). The proposed algorithm is shown to inherit both the high efficiency of SPSA for stochastic optimization problems and the superior accuracy and stability that the CS gradient approximation offers over the classic finite-difference (FD) method in deterministic numerical algorithms. Theoretical results show that the sequence of estimates generated by CS-SPSA converges almost surely to the optimal point. An application to a data-driven linear-quadratic regulator (LQR) optimal control problem demonstrates that CS-SPSA performs well compared with other algorithms.
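To make the two ingredients concrete, the following is a minimal noise-free sketch in Python with NumPy; it is an illustration, not the algorithm as specified in the paper. It assumes the objective is real-analytic and can be evaluated at complex arguments. The function names (`cs_derivative`, `fd_derivative`, `cs_sp_gradient`), the step sizes, and the Bernoulli ±1 perturbation are illustrative assumptions. The first two functions contrast the complex-step and central finite-difference derivatives of a scalar function; the third combines a complex step with a simultaneous random perturbation so that one function evaluation yields an estimate of the whole gradient.

```python
import numpy as np


def cs_derivative(f, x, h=1e-20):
    """Complex-step derivative: Im(f(x + i*h)) / h.

    No subtraction of nearly equal numbers occurs, so h can be tiny.
    """
    return np.imag(f(x + 1j * h)) / h


def fd_derivative(f, x, h=1e-8):
    """Central finite-difference derivative, shown for comparison."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def cs_sp_gradient(f, theta, c, rng):
    """Simultaneous-perturbation gradient estimate with a complex step.

    Hypothetical helper: all coordinates are perturbed at once along a
    random +/-1 direction, and a single complex evaluation of f is used.
    """
    delta = rng.choice([-1.0, 1.0], size=theta.shape)  # Bernoulli +/-1 (assumed)
    return np.imag(f(theta + 1j * c * delta)) / (c * delta)


def scalar_test_function(x):
    # Analytic test function with derivative exp(x) * (sin(x) + cos(x)).
    return np.exp(x) * np.sin(x)


def quadratic(theta):
    # Written with theta**2 (not abs) so that it stays analytic.
    return np.sum(theta ** 2)


def averaged_cs_sp_gradient(f, theta, c, n_draws, seed=0):
    """Average many single-draw estimates to reduce their variance."""
    rng = np.random.default_rng(seed)
    draws = [cs_sp_gradient(f, theta, c, rng) for _ in range(n_draws)]
    return np.mean(draws, axis=0)
```

With a step as small as `h = 1e-20`, the complex-step derivative of `scalar_test_function` remains accurate to roughly machine precision, whereas the central difference returns exactly zero because `x + h` rounds to `x`. A single call to `cs_sp_gradient` is noisy in more than one dimension, but for `quadratic` its average over many random directions approaches the true gradient `2 * theta`.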