Empirical comparison of robustness of classifiers on IR imagery

Many classifiers have been proposed for ATR applications. Given a set of labeled training data, a classifier is built from that data and then applied to predict the label of a new test point. If there is enough training data, and the test points are drawn i.i.d. from the same distribution as the training data, many classifiers perform quite well. In reality, however, there is rarely enough training data, or limited computational resources allow only part of it to be used. Likewise, the distribution of new test points may differ from that of the training data, so that the training data is not representative of the test data. In this paper, we empirically compare several classifiers on IR imagery: support vector machines, regularized least squares classifiers, C4.4, C4.5, random decision trees, bagged C4.4, and bagged C4.5. We repeatedly halve the training data, making it progressively less representative of the test data, and evaluate the resulting classifiers on the test set. This allows us to assess the robustness of each classifier against a shrinking knowledge base. A robust classifier is one whose accuracy is least sensitive to changes in the training data. Our results show that the ensemble methods (random decision trees, bagged C4.4, and bagged C4.5) outlast single classifiers as the training set shrinks.
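As a minimal sketch of this halving protocol, the following Python snippet (assuming scikit-learn, with synthetic features standing in for the IR imagery) trains a set of comparable classifiers on successively halved training sets and reports test accuracy at each size. The exact classifiers from the paper are not available in scikit-learn, so stand-ins are used: ExtraTreesClassifier for random decision trees, RidgeClassifier for the regularized least squares classifier, and scikit-learn's CART trees for C4.4/C4.5.

```python
# Sketch of the robustness-vs-training-size protocol described above.
# Assumptions: scikit-learn is available; synthetic features stand in for
# IR imagery; ExtraTrees approximates random decision trees, RidgeClassifier
# approximates regularized least squares, and DecisionTreeClassifier (CART)
# approximates C4.4/C4.5.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.linear_model import RidgeClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import BaggingClassifier, ExtraTreesClassifier
from sklearn.metrics import accuracy_score

# Synthetic two-class data as a placeholder for extracted IR image features.
X, y = make_classification(n_samples=4000, n_features=40, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0)

classifiers = {
    "SVM": SVC(kernel="rbf"),
    "RLS": RidgeClassifier(),
    "single tree": DecisionTreeClassifier(random_state=0),
    "bagged trees": BaggingClassifier(
        DecisionTreeClassifier(), n_estimators=30, random_state=0),
    "random trees": ExtraTreesClassifier(n_estimators=30, random_state=0),
}

# Halve the training set each round and re-evaluate on the fixed test set;
# a robust classifier's accuracy should degrade slowly as n shrinks.
n = len(X_train)
while n >= 50:
    Xs, ys = X_train[:n], y_train[:n]
    scores = {name: accuracy_score(y_test, clf.fit(Xs, ys).predict(X_test))
              for name, clf in classifiers.items()}
    print(n, {k: round(v, 3) for k, v in scores.items()})
    n //= 2
```

In this setup the ensemble methods typically hold their accuracy longer than the single tree as n shrinks, which mirrors the qualitative finding reported above; the printed numbers themselves depend on the synthetic data and stand-in classifiers.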
