Efficient computation of the Fisher information matrix in the EM algorithm

The expectation-maximization (EM) algorithm is an iterative computational method for calculating maximum likelihood estimators (MLEs) from sample data. Once the MLE is available, we naturally want the Fisher information matrix (FIM) of the unknown parameters as well. However, one limitation of the EM algorithm is that the FIM is not an automatic by-product of the algorithm. In this paper, we construct a simple Monte Carlo-based method that requires only basic operations and the gradient values of the function obtained from the E step. The key part of our method is to use the simultaneous perturbation stochastic approximation (SPSA) method to estimate the Hessian matrix from the gradient of the conditional expectation of the complete-data log-likelihood function.
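To make the idea of estimating a Hessian from gradient values concrete, the following is a minimal sketch of a simultaneous-perturbation Hessian estimate, not the paper's exact procedure. The names `spsa_hessian`, `grad_fn`, `c`, and `n_draws` are hypothetical, and the toy gradient `toy_grad` (from a quadratic log-likelihood with known curvature `A`) is an assumption standing in for the gradient obtained from the E step. Each draw perturbs the parameter vector with random ±1 components, differences two gradient evaluations, and forms a symmetrized outer product; averaging over draws approximates the Hessian, whose negative serves as the information estimate.

```python
import numpy as np

def spsa_hessian(grad_fn, theta, c=1e-3, n_draws=2000, rng=None):
    """Average simultaneous-perturbation Hessian estimates built from gradients.

    grad_fn is a hypothetical callable returning the gradient at theta; in the
    EM setting it would be supplied by the E-step function (an assumption here).
    """
    rng = np.random.default_rng() if rng is None else rng
    theta = np.asarray(theta, dtype=float)
    p = theta.size
    h_sum = np.zeros((p, p))
    for _ in range(n_draws):
        # Random perturbation with independent +/-1 components.
        delta = rng.choice([-1.0, 1.0], size=p)
        # Two-sided difference of gradient values along the perturbation.
        dg = grad_fn(theta + c * delta) - grad_fn(theta - c * delta)
        # Outer product with the elementwise inverse of delta, then symmetrize.
        h = np.outer(dg / (2.0 * c), 1.0 / delta)
        h_sum += 0.5 * (h + h.T)
    return h_sum / n_draws

# Toy check: log-likelihood -0.5 * theta' A theta has gradient -A theta and
# Hessian -A, so the negative Hessian estimate should be close to A.
A = np.array([[2.0, 0.5], [0.5, 1.0]])

def toy_grad(theta):
    return -A @ theta

info_hat = -spsa_hessian(toy_grad, np.zeros(2), rng=np.random.default_rng(0))
```

Only gradient evaluations and elementary matrix operations are used, which is the property the abstract emphasizes; the perturbation size and number of draws chosen here are illustrative rather than recommended values.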
