Adaptive Learning-Based $k$-Nearest Neighbor Classifiers With Resilience to Class Imbalance

The classification accuracy of a $k$-nearest neighbor ($k$NN) classifier is largely dependent on the choice of the number of nearest neighbors, denoted by $k$. However, given a data set, it is a tedious task to optimize the performance of $k$NN by tuning $k$. Moreover, the performance of $k$NN degrades in the presence of class imbalance, a situation characterized by disparate representation of the different classes. We aim to address both of these issues in this paper and propose a variant of $k$NN called the Adaptive $k$NN (Ada-$k$NN). The Ada-$k$NN classifier uses the density and distribution of the neighborhood of a test point and learns a suitable point-specific $k$ for it with the help of artificial neural networks. We further improve on this proposal by replacing the neural network with a heuristic learning method guided by an indicator of the local density of a test point, using information about its neighboring training points. The proposed heuristic learning algorithm preserves the simplicity of $k$NN without incurring a serious computational burden. We call this method Ada-$k$NN2. Ada-$k$NN and Ada-$k$NN2 perform very competitively when compared with $k$NN, five state-of-the-art variants of $k$NN, and other popular classifiers. Furthermore, we propose a class-based global weighting scheme (the Global Imbalance Handling Scheme, or GIHS) to compensate for the effect of class imbalance. We perform extensive experiments on a wide variety of data sets to establish the improvement shown by Ada-$k$NN and Ada-$k$NN2 using the proposed GIHS, when compared with $k$NN and its 12 variants specifically tailored for imbalanced classification.
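The following is a minimal, hypothetical sketch of the two ideas described above: choosing a point-specific $k$ for each query rather than one global $k$, and weighting neighbor votes by inverse class frequency to offset class imbalance (in the spirit of a global, class-based weighting scheme). The selection heuristic, the weighting formula, and the function names (`adaptive_weighted_knn`, `class_weights`) are illustrative assumptions, not the exact Ada-$k$NN, Ada-$k$NN2, or GIHS procedures from the paper.

```python
import numpy as np
from collections import Counter


def class_weights(y_train):
    """Inverse-frequency weight per class: rarer classes get larger votes."""
    counts = Counter(y_train)
    n = len(y_train)
    return {c: n / (len(counts) * cnt) for c, cnt in counts.items()}


def adaptive_weighted_knn(X_train, y_train, x_query, k_candidates=(1, 3, 5, 7, 9)):
    """Predict the label of x_query using a point-specific k and weighted votes."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    dists = np.linalg.norm(X_train - np.asarray(x_query, dtype=float), axis=1)
    order = np.argsort(dists)
    weights = class_weights(y_train)

    def weighted_votes(k):
        votes = Counter()
        for i in order[:k]:
            votes[y_train[i]] += weights[y_train[i]]
        return votes

    # Heuristic k selection (illustrative only): pick the candidate k whose
    # weighted neighborhood agrees most strongly on a single class.
    best_k, best_score = k_candidates[0], -np.inf
    for k in k_candidates:
        votes = weighted_votes(k)
        score = votes.most_common(1)[0][1] / sum(votes.values())
        if score > best_score:
            best_k, best_score = k, score

    return weighted_votes(best_k).most_common(1)[0][0], best_k


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Imbalanced toy data: 90 points of class 0, 10 points of class 1.
    X = np.vstack([rng.normal(0, 1, (90, 2)), rng.normal(2, 1, (10, 2))])
    y = np.array([0] * 90 + [1] * 10)
    label, k = adaptive_weighted_knn(X, y, x_query=[2.0, 2.0])
    print(f"predicted class {label} using point-specific k={k}")
```

In this toy run the inverse-frequency weights keep the minority class from being outvoted purely by its scarcity, which is the general intent of a class-based global weighting scheme; the paper's actual methods learn the per-point $k$ with a neural network or a density-guided heuristic rather than the simple agreement score used here.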
