Matched-Pair Machine Learning

Following an analogous distinction in statistical hypothesis testing and motivated by chemical plume detection in hyperspectral imagery, we investigate machine-learning algorithms for which the training set consists of matched pairs. We find that even conventional classifiers perform better when the input data have a matched-pair structure, and we develop an example of a “dipole” algorithm that directly exploits this structured input. In some scenarios, matched pairs can be generated from independent samples, which not only doubles the nominal size of the training set but also provides the matched-pair structure that leads to better learning. Creating matched pairs from a dataset of interest also permits a kind of transductive learning, which, for the plume detection problem, is found to improve performance. Supplementary materials for this article are available online.
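As a rough illustration of how matched pairs might be generated from independent samples, the sketch below pairs each background sample with a copy that has a signature added to it. The additive model, the function name `make_matched_pairs`, and the `signature` and `strength` parameters are assumptions made for this example; the abstract does not specify how the pairs are actually constructed.

```python
import numpy as np

def make_matched_pairs(background, signature, strength=1.0):
    """Hypothetical helper: pair each background sample with a modified copy.

    Assumes a simple additive model (sample + strength * signature),
    which is an illustrative choice and not taken from the article.
    """
    background = np.asarray(background, dtype=float)
    signature = np.asarray(signature, dtype=float)
    # Row i of `with_signature` is the matched partner of row i of `background`.
    with_signature = background + strength * signature
    # Stack the two halves: the training set is nominally twice as large.
    X = np.vstack([background, with_signature])
    y = np.concatenate([-np.ones(len(background)), np.ones(len(background))])
    return X, y

# Synthetic stand-in for independent background samples (100 samples, 5 channels).
rng = np.random.default_rng(0)
background = rng.normal(size=(100, 5))
signature = np.array([0.0, 1.0, 0.5, 0.0, -0.5])

X, y = make_matched_pairs(background, signature, strength=0.3)
```

Here rows `i` and `i + 100` of `X` form one matched pair, so `X` and `y` can be passed to a conventional classifier unchanged, while an algorithm that exploits the pairing would also use the correspondence between the two halves.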
