Realisable Classifiers: Improving Operating Performance on Variable Cost Problems

A novel method is described for obtaining superior classification performance over a variable range of classification costs. By analysis of a set of existing classifiers using a receiver operating characteristic (ROC) curve, a set of new realisable classifiers may be obtained by a random combination of two of the existing classifiers. These classifiers lie on the convex hull that contains the original ROC points of the existing classifiers. This hull is the maximum realisable ROC (MRROC). A theorem for this method is derived and proved from an observation about ROC data, and experimental results verify that a superior classification system may be constructed using only the existing classifiers and the information of the original ROC data. This new system is shown to produce the MRROC, and as such provides a powerful technique for improving classification systems in problem domains within which classification costs may not be known a priori. Empirical results are presented for artificial data, and for two real-world data sets: an image segmentation task and the diagnosis of abnormal thyroid condition.
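The construction described above can be illustrated with a minimal sketch (not taken from the paper; the function names, the use of NumPy, and the classifier-as-callable interface are assumptions). It computes the upper convex hull of a set of (false positive rate, true positive rate) operating points, and builds a randomised combination of two existing classifiers whose expected operating point is the corresponding convex combination of their individual points, which is how points on the MRROC between two hull vertices may be realised.

```python
import numpy as np

def roc_convex_hull(points):
    """Upper convex hull of ROC points (FPR, TPR), anchored at (0, 0) and (1, 1).

    Monotone-chain sweep by increasing FPR: a candidate vertex is kept only
    if it lies strictly above the chord joining its neighbours.
    """
    pts = sorted(set([(0.0, 0.0), (1.0, 1.0)] + list(points)))
    hull = []
    for p in pts:
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # Drop the middle point if it lies on or below the chord from
            # hull[-2] to the new point p.
            if (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1) >= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull

def realised_classifier(clf_a, clf_b, p, rng=None):
    """Classifier that applies clf_a with probability p and clf_b otherwise.

    If clf_a operates at (FPR_a, TPR_a) and clf_b at (FPR_b, TPR_b), the
    expected operating point of the returned classifier is
    p * (FPR_a, TPR_a) + (1 - p) * (FPR_b, TPR_b).
    """
    rng = rng or np.random.default_rng()

    def predict(x):
        return clf_a(x) if rng.random() < p else clf_b(x)

    return predict
```

As a usage sketch, one would evaluate each existing classifier on validation data to obtain its (FPR, TPR) point, pass the points to roc_convex_hull, and then, for a desired operating point on a hull edge, call realised_classifier with the two classifiers at the edge's endpoints and the interpolation weight p that places the convex combination at that point.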
