Caveats with Stochastic Gradient and Maximum Likelihood Based ICA for EEG

Stochastic gradient (SG) is the most commonly used optimization technique for maximum-likelihood-based approaches to independent component analysis (ICA). In particular, it is the default solver in public implementations of Infomax and its variants. Motivated by experimental findings on electroencephalography (EEG) data, we report some caveats that can affect the results and the interpretation of neuroscience findings. We investigate issues that arise when step-size control in the gradient updates is combined with early stopping conditions, as well as initialization choices that can artificially generate biologically plausible brain sources, so-called dipolar sources. We provide experimental evidence that pushing the convergence of Infomax further using non-stochastic solvers can reduce the number of highly dipolar components, and we give a mathematical explanation of this fact. Results are presented on public EEG data.
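To make the interaction between step size and early stopping concrete, the following is a minimal sketch of a stochastic relative-gradient update for maximum likelihood ICA. It is not the implementation studied in the paper: the function names, the `tanh` score function (a generic super-Gaussian choice), the identity initialization, and the stopping rule based on the change in the unmixing matrix are all assumptions made for illustration.

```python
import numpy as np


def relative_gradient(W, X):
    """Relative gradient of the negative log-likelihood at W.

    Assumes a super-Gaussian source density proportional to 1/cosh(y),
    whose score function is tanh (an illustrative choice).
    """
    Y = W @ X
    n_samples = X.shape[1]
    return np.tanh(Y) @ Y.T / n_samples - np.eye(W.shape[0])


def neg_log_likelihood(W, X):
    """Average negative log-likelihood, up to an additive constant."""
    Y = W @ X
    _, logdet = np.linalg.slogdet(W)
    return -logdet + np.mean(np.sum(np.log(np.cosh(Y)), axis=0))


def sgd_infomax(X, step=0.01, n_epochs=50, batch_size=64, tol=1e-4, seed=0):
    """Stochastic relative-gradient ICA with a naive stopping rule.

    X has shape (n_channels, n_samples). The loop stops when W changes
    by less than `tol` over one epoch (hypothetical rule). A small step
    size can satisfy this test while the full-data gradient is still
    far from zero, so the final gradient norm is returned as well.
    """
    rng = np.random.default_rng(seed)
    n_channels, n_samples = X.shape
    W = np.eye(n_channels)  # identity initialization (assumption)
    for _ in range(n_epochs):
        W_previous = W.copy()
        order = rng.permutation(n_samples)
        for start in range(0, n_samples, batch_size):
            batch = X[:, order[start:start + batch_size]]
            G = relative_gradient(W, batch)
            W = W - step * G @ W  # relative (natural) gradient step
        if np.linalg.norm(W - W_previous) < tol:
            break  # early stop based on the change in W only
    full_gradient_norm = np.linalg.norm(relative_gradient(W, X))
    return W, full_gradient_norm
```

Comparing the returned gradient norm across different `step` and `tol` values is one simple way to check whether a run stopped because the objective was actually stationary or only because the updates became small.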
