Cyclic seesaw optimization and identification

In the seesaw (or cyclic, or alternating) method for optimization and identification, the full parameter vector is divided into two or more subvectors, and the process proceeds by sequentially optimizing each subvector while holding the remaining parameters at their most recent values. One advantage of the scheme is that it preserves large investments in existing software while extending the estimation capability to new parameters. A specific case involves cross-sectional data represented in state-space form, where the aim is to estimate the mean vector and covariance matrix of the initial state vector as well as parameters associated with the dynamics of the underlying differential equations. This paper shows that, under reasonable conditions, the cyclic scheme leads to parameter estimates that converge to the optimal joint value for the full vector of unknown parameters. The convergence conditions here differ from others in the literature. Further, relative to standard search methods on the full vector, the numerical results here suggest a more general property of faster convergence as a consequence of the more “aggressive” (larger) gain coefficient (step size) possible in the seesaw algorithm.
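The sketch below illustrates only the general cyclic (block-coordinate) idea described above: split the parameter vector into two subvectors and optimize them in alternation, each pass holding the other subvector at its most recent value. The quadratic test loss, the plain gradient updates, and the chosen block sizes are illustrative assumptions, not the paper's state-space identification algorithm or its convergence conditions.

```python
# Minimal sketch of cyclic ("seesaw") optimization on an illustrative quadratic loss.
# The loss, the gradient-step inner solver, and all constants are assumptions made
# for demonstration; they are not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)

# Illustrative loss: L(theta) = 0.5 * theta^T A theta - b^T theta, with A positive definite.
dim = 6
M = rng.standard_normal((dim, dim))
A = M @ M.T + dim * np.eye(dim)
b = rng.standard_normal(dim)

def grad(theta):
    return A @ theta - b

# Split the full vector into two subvectors: indices 0..2 and 3..5.
block1 = np.arange(0, 3)
block2 = np.arange(3, 6)

theta = np.zeros(dim)
gain = 0.05          # step size (gain coefficient) used within each block
inner_steps = 20     # gradient steps on one block before switching to the other
n_cycles = 200

for _ in range(n_cycles):
    for block in (block1, block2):
        for _ in range(inner_steps):
            g = grad(theta)
            theta[block] -= gain * g[block]   # update only the active subvector

theta_opt = np.linalg.solve(A, b)             # closed-form joint minimizer for comparison
print("seesaw estimate:", np.round(theta, 4))
print("joint optimum  :", np.round(theta_opt, 4))
```

In this toy setting the alternating estimate approaches the joint minimizer, mirroring the qualitative claim of the abstract; the paper's actual results concern state-space identification and give formal conditions under which such convergence holds.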
