Kernel PCA for Feature Extraction and De-Noising in Nonlinear Regression

In this paper, we propose applying Kernel Principal Component Analysis (Kernel PCA) for feature selection in a high-dimensional feature space into which the input variables are mapped by a Gaussian kernel. The extracted features are used in two regression problems: prediction of the chaotic Mackey–Glass time series in a noisy environment, and estimation of human signal detection performance from brain event-related potentials elicited by task-relevant signals. We compare results obtained using either Kernel PCA or linear PCA as the data preprocessing step. On the human signal detection task, we report the superiority of Kernel PCA feature extraction over linear PCA. As with linear PCA, we demonstrate de-noising of the original data through an appropriate selection of nonlinear principal components. The theoretical relation between Kernel Principal Components Regression, Kernel Ridge Regression and ε-insensitive Support Vector Regression, together with an experimental comparison of the three, is also provided.
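The pipeline described in the abstract (Gaussian-kernel mapping, extraction of nonlinear principal components, then regression on the retained components) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The kernel form exp(-||x - y||^2 / width), the function names, the toy noisy-sine data, and the values `width=2.0` and `n_components=8` are all assumptions chosen for illustration.

```python
import numpy as np

def gaussian_kernel(A, B, width):
    # Assumed kernel form: K[i, j] = exp(-||a_i - b_j||^2 / width)
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / width)

def kernel_pcr_fit(X, y, width, n_components):
    n = X.shape[0]
    K = gaussian_kernel(X, X, width)
    H = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    Kc = H @ K @ H                               # kernel centered in feature space
    eigvals, eigvecs = np.linalg.eigh(Kc)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    lam, V = eigvals[order], eigvecs[:, order]
    A = V / np.sqrt(lam)                         # unit-norm directions in feature space
    Z = Kc @ A                                   # projections of the training points
    y_mean = y.mean()
    # Ordinary least squares on the retained nonlinear principal components
    w, *_ = np.linalg.lstsq(Z, y - y_mean, rcond=None)
    return {"X": X, "K": K, "A": A, "Z": Z, "lam": lam,
            "w": w, "y_mean": y_mean, "width": width}

def kernel_pcr_predict(model, X_new):
    K, n = model["K"], model["K"].shape[0]
    K_new = gaussian_kernel(X_new, model["X"], model["width"])
    one_n = np.ones((n, n)) / n
    one_m = np.ones((X_new.shape[0], n)) / n
    # Center the new kernel rows consistently with the training kernel
    Kc_new = K_new - one_m @ K - K_new @ one_n + one_m @ K @ one_n
    return Kc_new @ model["A"] @ model["w"] + model["y_mean"]

# Toy data (hypothetical): a noisy sine curve
rng = np.random.default_rng(0)
X = np.linspace(-3.0, 3.0, 80)[:, None]
y = np.sin(X[:, 0]) + 0.2 * rng.standard_normal(80)

model = kernel_pcr_fit(X, y, width=2.0, n_components=8)
pred = kernel_pcr_predict(model, X)
print("training MSE:", np.mean((pred - y) ** 2))
```

Keeping only the leading components is what gives the de-noising effect in this sketch: directions with small eigenvalues, which in this toy setting carry mostly noise, are discarded before the regression step. In practice the number of components and the kernel width would be chosen by a model selection procedure such as cross-validation.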
