Boosting with Lexicographic Programming: Addressing Class Imbalance without Cost Tuning

Considerable research effort has been dedicated to adapting boosting for imbalanced classification. However, boosting methods are not yet satisfactorily robust to class imbalance, especially in multi-class problems. This is because most existing solutions for handling class imbalance rely on expensive cost set tuning to determine the proper level of compensation. We show that the assignment of weights to the component classifiers of a boosted ensemble can be viewed as a game of Tug of War between the classes in the margin space. We then demonstrate how this insight can be used to attain a good compromise between the rare and abundant classes without resorting to cost set tuning, which has long been the norm for imbalanced classification. The solution is based on a two-stage lexicographic linear programming framework. In the first stage, class-specific component weight combinations are found by minimizing a hinge loss separately for each class. In the second stage, the final component weights are assigned so that the maximum deviation from the class-specific minimum loss values obtained in the first stage is minimized. Because both stages are defined per class, the proposal is not restricted to two-class situations and is readily applicable to multi-class problems. We also derive the dual formulation corresponding to the proposed framework. Experiments conducted on artificial and real-world imbalanced datasets, as well as on challenging applications such as hyperspectral image classification and ImageNet classification, establish the efficacy of the proposal.
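The two-stage procedure described above can be illustrated with a minimal sketch. This is not the authors' formulation: the margin matrix `M` (entry +1 when a component classifies a point correctly, -1 otherwise), the simplex constraint on the weights, the per-class mean hinge loss with threshold `theta`, and the function names `class_hinge_lp` and `lexicographic_weights` are all assumptions made for illustration. It uses `scipy.optimize.linprog` to solve both stages.

```python
import numpy as np
from scipy.optimize import linprog


def class_hinge_lp(M, idx, theta=1.0):
    """Stage 1 (assumed form): minimum mean hinge loss of one class."""
    Mc = M[idx]
    n_c, T = Mc.shape
    # Variables: [alpha (T weights), xi (n_c slacks)]; minimise the mean slack.
    c = np.concatenate([np.zeros(T), np.full(n_c, 1.0 / n_c)])
    # Hinge constraint theta - M_i.alpha <= xi_i, written as A_ub @ x <= b_ub.
    A_ub = np.hstack([-Mc, -np.eye(n_c)])
    b_ub = np.full(n_c, -theta)
    # Weights lie on the simplex: sum(alpha) = 1, alpha >= 0 (an assumption).
    A_eq = np.concatenate([np.ones(T), np.zeros(n_c)])[None, :]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=(0, None), method="highs")
    return res.fun


def lexicographic_weights(M, y, theta=1.0):
    """Stage 2 (assumed form): minimise the largest per-class deviation."""
    n, T = M.shape
    classes = np.unique(y)
    best = np.array([class_hinge_lp(M, y == k, theta) for k in classes])
    # Variables: [alpha (T), xi (n), d]; minimise the maximum deviation d.
    c = np.concatenate([np.zeros(T + n), [1.0]])
    hinge = np.hstack([-M, -np.eye(n), np.zeros((n, 1))])
    # One row per class: mean slack of the class minus d <= stage-1 optimum.
    dev = np.zeros((len(classes), T + n + 1))
    for r, k in enumerate(classes):
        dev[r, T:T + n] = (y == k) / np.sum(y == k)
        dev[r, -1] = -1.0
    A_ub = np.vstack([hinge, dev])
    b_ub = np.concatenate([np.full(n, -theta), best])
    A_eq = np.concatenate([np.ones(T), np.zeros(n + 1)])[None, :]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=(0, None), method="highs")
    alpha = res.x[:T]
    # Recompute the actual per-class hinge losses from the final weights.
    losses = np.array([np.maximum(0.0, theta - M[y == k] @ alpha).mean()
                       for k in classes])
    return alpha, best, losses


# Synthetic, imbalanced three-class toy data (purely illustrative).
rng = np.random.default_rng(0)
y = np.array([0] * 40 + [1] * 8 + [2] * 4)
M = np.where(rng.random((y.size, 6)) < 0.7, 1.0, -1.0)
alpha, best, losses = lexicographic_weights(M, y)
print("weights:", np.round(alpha, 3))
print("per-class deviation from stage-1 optimum:", np.round(losses - best, 3))
```

In this sketch no class-specific cost appears anywhere: the only coupling between classes is the shared deviation variable `d`, so the rare and abundant classes pull on the weights until the worst-off class is as close as possible to its own best achievable loss.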
