Comparing Classification Methods for Longitudinal fMRI Studies

We compare 10 methods for classifying fMRI volumes: adaptive Fisher's linear and quadratic discriminants; Gaussian naive Bayes; support vector machines with linear, quadratic, and radial basis function (RBF) kernels; logistic regression; two novel methods based on pairs of restricted Boltzmann machines (RBMs); and K-nearest neighbors. The methods were applied to data from a longitudinal study of stroke recovery, tested on three binary classification tasks, and compared by out-of-sample classification accuracy. Relative performance varied considerably across subjects and classification tasks. The best overall performers were the adaptive quadratic discriminant, support vector machines with RBF kernels, and generatively trained pairs of RBMs.
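The comparison criterion, out-of-sample accuracy on a binary task, can be illustrated with a minimal sketch. It is not the study's pipeline: it uses synthetic Gaussian data rather than fMRI volumes, a single random 70/30 train/test split, and only two of the ten method families (Gaussian naive Bayes and K-nearest neighbors). All function names, the data dimensions, and the choice k = 5 are assumptions made for illustration.

```python
import math
import random
from collections import Counter


def make_data(n_per_class, n_features, shift, rng):
    """Synthetic two-class data (an assumption): class 1 has its mean shifted."""
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            X.append([rng.gauss(label * shift, 1.0) for _ in range(n_features)])
            y.append(label)
    return X, y


def fit_gnb(X, y):
    """Estimate per-class prior, feature means, and feature variances."""
    model = {}
    for c in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == c]
        cols = list(zip(*rows))
        means = [sum(col) / len(rows) for col in cols]
        # Small constant keeps variances strictly positive.
        varis = [sum((v - m) ** 2 for v in col) / len(rows) + 1e-6
                 for col, m in zip(cols, means)]
        model[c] = (len(rows) / len(y), means, varis)
    return model


def predict_gnb(model, x):
    """Return the class with the highest log posterior (up to a constant)."""
    def log_post(c):
        prior, means, varis = model[c]
        return math.log(prior) + sum(
            -0.5 * math.log(2 * math.pi * s) - (v - m) ** 2 / (2 * s)
            for v, m, s in zip(x, means, varis))
    return max(model, key=log_post)


def predict_knn(X_train, y_train, x, k=5):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    order = sorted(range(len(X_train)), key=lambda i: math.dist(X_train[i], x))
    votes = Counter(y_train[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def accuracy(predict, X, y):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(predict(x) == t for x, t in zip(X, y)) / len(y)


def compare(seed=0):
    """Fit on a random 70% of the data and score both methods on the held-out 30%."""
    rng = random.Random(seed)
    X, y = make_data(n_per_class=100, n_features=10, shift=1.0, rng=rng)
    idx = list(range(len(X)))
    rng.shuffle(idx)
    split = int(0.7 * len(idx))
    train, test = idx[:split], idx[split:]
    X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
    X_te, y_te = [X[i] for i in test], [y[i] for i in test]
    gnb = fit_gnb(X_tr, y_tr)
    return {
        "gnb": accuracy(lambda x: predict_gnb(gnb, x), X_te, y_te),
        "knn": accuracy(lambda x: predict_knn(X_tr, y_tr, x), X_te, y_te),
    }


if __name__ == "__main__":
    for name, acc in compare().items():
        print(f"{name}: held-out accuracy = {acc:.3f}")
```

Calling `compare()` with different seeds gives a rough sense of how the ranking of two methods can shift from one split to another, which is the kind of variability the abstract reports across subjects and tasks.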
