Finite Sample Error Bound for Parzen Windows

Parzen Windows is a nonparametric method that has been applied to a variety of density estimation and classification problems. Like nearest neighbor methods, Parzen Windows involves no training. While it converges to the true but unknown probability density in the asymptotic limit, theoretical analysis of its performance with finite samples has been lacking. In this paper we establish a finite sample error bound for Parzen Windows. We first show that Parzen Windows is an approximation to regularized least squares (RLS) methods, which have been well studied in statistical learning theory. We then derive the finite sample error bound for Parzen Windows and discuss its properties and its relationship to the error bound for RLS. This analysis provides interesting insight into both Parzen Windows and the nearest neighbor method from the perspective of learning theory. Finally, we report empirical results comparing Parzen Windows with nearest neighbors, RLS, and SVMs on a number of real data sets. These results corroborate our theoretical analysis well.
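To make the connection concrete, here is a minimal sketch of the two classifiers the abstract relates. It assumes a Gaussian kernel, binary labels in {-1, +1}, and illustrative function names (`parzen_window_classify`, `rls_classify`); it is not the paper's exact construction. The Parzen Windows rule is the plug-in classifier sign((1/n) Σᵢ yᵢ K(x, xᵢ)); RLS solves (K + nλI)c = y, and as λ grows, c approaches y/(nλ), so the RLS decision rule approaches the Parzen rule up to a positive scaling.

```python
import numpy as np

def gaussian_kernel(X, Z, sigma=1.0):
    """Pairwise Gaussian (Parzen window) kernel between rows of X and Z."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def parzen_window_classify(X_train, y_train, X_test, sigma=1.0):
    """Binary Parzen Windows classifier: sign of the kernel-weighted label sum.

    Labels are assumed to be in {-1, +1}; the rule sign(sum_i y_i K(x, x_i))
    is the difference of the two class-conditional Parzen density estimates,
    weighted by the empirical class priors.
    """
    K = gaussian_kernel(X_test, X_train, sigma)
    return np.sign(K @ y_train)

def rls_classify(X_train, y_train, X_test, sigma=1.0, lam=1.0):
    """Regularized least squares (kernel ridge) classifier for comparison.

    Solves (K + n*lam*I) c = y and predicts sign(K_test @ c). For large lam,
    c ~ y / (n*lam), so the prediction has the same sign as the Parzen rule --
    the approximation relationship discussed in the paper.
    """
    n = X_train.shape[0]
    K = gaussian_kernel(X_train, X_train, sigma)
    c = np.linalg.solve(K + n * lam * np.eye(n), y_train)
    K_test = gaussian_kernel(X_test, X_train, sigma)
    return np.sign(K_test @ c)
```

Running both rules on the same data with a large regularization parameter λ makes the approximation visible: the two classifiers produce (nearly) identical label assignments, differing only where the kernel-weighted vote is close to zero.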
