Training neural networks using sequential extended Kalman filtering

Recent work has demonstrated the use of the extended Kalman filter (EKF) as an alternative to gradient-descent backpropagation for training multi-layer perceptrons. The EKF approach significantly improves convergence, but at the cost of greater storage and computational complexity. Feldkamp et al. have described a decoupled version of the EKF that preserves the training advantages of the general EKF while reducing storage and computational requirements. This paper reviews the general and decoupled EKF approaches and presents sequentialized versions that provide further computational savings over the batch forms. The usefulness of the sequentialized EKF algorithms is demonstrated on a pattern classification problem.
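
To make the source of the computational savings concrete, the following is a minimal sketch of the kind of scalar-measurement update a sequentialized EKF performs: by processing one output component (or training pattern) at a time, the matrix inversion required by the batch form reduces to a scalar division. The function name `sekf_update`, the hyperparameters `r` and `q`, and the use of NumPy are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def sekf_update(w, P, h, err, r=1.0, q=1e-6):
    """One sequential EKF update for a single scalar output component.

    w   : (n,) flattened network weight vector
    P   : (n, n) approximate error covariance of the weights
    h   : (n,) derivative of the scalar output w.r.t. the weights
          (one row of the network Jacobian, obtained by backpropagation)
    err : scalar target-minus-output error for this component
    r   : assumed measurement-noise variance (hyperparameter)
    q   : assumed artificial process-noise variance, helps keep P
          positive definite during training
    """
    Ph = P @ h                  # (n,) intermediate product
    s = r + h @ Ph              # scalar innovation variance (no matrix inverse)
    k = Ph / s                  # Kalman gain, (n,)
    w = w + k * err             # weight update
    P = P - np.outer(k, Ph) + q * np.eye(len(w))  # covariance update
    return w, P
```

In a decoupled variant, the same update would be applied per weight group with a block-diagonal P rather than the full covariance matrix, trading some coupling information for storage savings.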