Dealing with Multiple Classes in Online Class Imbalance Learning

Online class imbalance learning deals with data streams that have very skewed class distributions and must be processed in a timely fashion. Although a few methods have been proposed to handle such problems, most of them focus on two-class cases. Multi-class imbalance imposes additional challenges in learning. This paper studies the combined challenges posed by multi-class imbalance and online learning, and aims at a more effective and adaptive solution. First, we introduce two resampling-based ensemble methods, called MOOB and MUOB, which can process multi-class data directly and strictly online with an adaptive sampling rate. Then, we examine the impact of multi-minority and multi-majority cases on MOOB and MUOB, in comparison with other methods, under stationary and dynamic scenarios. Both multi-minority and multi-majority cases have a negative impact on performance. MOOB shows the best and most stable G-mean in most stationary and dynamic cases.
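
For readers unfamiliar with the evaluation metric, G-mean in the multi-class setting is commonly defined as the geometric mean of the per-class recalls, so a classifier that ignores any one class scores zero regardless of its overall accuracy. The following is a minimal sketch under that common definition; the function name `g_mean` and the toy labels are illustrative assumptions, not taken from the paper, and the paper's experiments may compute the metric in a time-decayed or prequential form instead.

```python
import math
from collections import defaultdict


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recalls (hypothetical helper, not from the paper).

    Only classes that occur in y_true are considered.
    """
    seen = defaultdict(int)  # number of examples of each true class
    hit = defaultdict(int)   # number of correctly predicted examples per class
    for t, p in zip(y_true, y_pred):
        seen[t] += 1
        if t == p:
            hit[t] += 1
    if not seen:
        raise ValueError("y_true must contain at least one label")
    recalls = [hit[c] / seen[c] for c in seen]
    return math.prod(recalls) ** (1.0 / len(recalls))


# Toy stream: class 0 is the majority, classes 1 and 2 are minorities.
y_true = [0, 0, 0, 0, 1, 1, 2, 2]

# One minority example is misclassified: recalls are 1.0, 0.5, 1.0.
y_pred_a = [0, 0, 0, 0, 1, 0, 2, 2]

# A trivial majority-class predictor: accuracy is 0.5, but G-mean is 0.
y_pred_b = [0] * len(y_true)

score_a = g_mean(y_true, y_pred_a)
score_b = g_mean(y_true, y_pred_b)
```

The second case shows why G-mean is preferred over accuracy for imbalanced streams: predicting only the majority class still gets half of the examples right, yet the zero recall on the minority classes drives the G-mean to zero.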
