A Review on Ensembles for the Class Imbalance Problem: Bagging-, Boosting-, and Hybrid-Based Approaches
Francisco Herrera | Humberto Bustince | Mikel Galar | Edurne Barrenechea Tartas | Alberto Fernández
[1] F. Wilcoxon. Individual Comparisons by Ranking Methods , 1945 .
[2] S. Holm. A Simple Sequentially Rejective Multiple Test Procedure , 1979 .
[3] J. Shaffer. Modified Sequentially Rejective Multiple Test Procedures , 1986 .
[4] J. Ross Quinlan,et al. C4.5: Programs for Machine Learning , 1992 .
[5] Elie Bienenstock,et al. Neural Networks and the Bias/Variance Dilemma , 1992, Neural Computation.
[6] Sargur N. Srihari,et al. Decision Combination in Multiple Classifier Systems , 1994, IEEE Trans. Pattern Anal. Mach. Intell..
[7] Anders Krogh,et al. Neural Network Ensembles, Cross Validation, and Active Learning , 1994, NIPS.
[8] Thomas G. Dietterich,et al. Error-Correcting Output Coding Corrects Bias and Variance , 1995, ICML.
[9] Yoav Freund,et al. A decision-theoretic generalization of on-line learning and an application to boosting , 1995, EuroCOLT.
[10] Naonori Ueda,et al. Generalization error of ensemble estimators , 1996, Proceedings of International Conference on Neural Networks (ICNN'96).
[11] Robert Tibshirani,et al. Bias, Variance and Prediction Error for Classification Rules , 1996 .
[12] Leo Breiman,et al. Bias, Variance, and Arcing Classifiers , 1996 .
[13] Kagan Tumer,et al. Error Correlation and Error Reduction in Ensemble Classifiers , 1996, Connect. Sci..
[14] Ron Kohavi,et al. Bias Plus Variance Decomposition for Zero-One Loss Functions , 1996, ICML.
[15] Andrew P. Bradley,et al. The use of the area under the ROC curve in the evaluation of machine learning algorithms , 1997, Pattern Recognit..
[16] Jiri Matas,et al. On Combining Classifiers , 1998, IEEE Trans. Pattern Anal. Mach. Intell..
[17] Salvatore J. Stolfo,et al. Toward Scalable Learning with Non-Uniform Class and Cost Distributions: A Case Study in Credit Card Fraud Detection , 1998, KDD.
[18] Yoram Singer,et al. Improved Boosting Algorithms Using Confidence-rated Predictions , 1998, COLT '98.
[19] Salvatore J. Stolfo,et al. AdaCost: Misclassification Cost-Sensitive Boosting , 1999, ICML.
[20] Yiming Ma,et al. Improving an Association Rule Based Classifier , 2000, PKDD.
[21] Kai Ming Ting,et al. A Comparative Study of Cost-Sensitive Boosting Algorithms , 2000, ICML.
[22] J. Friedman. Additive logistic regression: A statistical view of boosting , 2000 .
[23] D. Sheskin. Handbook of parametric and nonparametric statistical procedures, 2nd ed. , 2000 .
[24] Maliha S. Nash,et al. Handbook of Parametric and Nonparametric Statistical Procedures , 2001, Technometrics.
[25] Vipin Kumar,et al. Evaluating boosting algorithms to classify rare classes: comparison and improvements , 2001, Proceedings 2001 IEEE International Conference on Data Mining.
[26] Shigeo Abe. Pattern Classification , 2001, Springer London.
[27] Bianca Zadrozny,et al. Learning and making decisions when costs and probabilities are both unknown , 2001, KDD '01.
[28] Xiaohua Hu,et al. Using rough sets theory and database operations to construct a good ensemble of classifiers for data mining applications , 2001, Proceedings 2001 IEEE International Conference on Data Mining.
[29] Tin Kam Ho,et al. Multiple Classifier Combination: Lessons and Next Steps , 2002 .
[30] Nitesh V. Chawla,et al. SMOTE: Synthetic Minority Over-sampling Technique , 2002, J. Artif. Intell. Res..
[31] R. Barandela,et al. Strategies for learning in class imbalance problems , 2003, Pattern Recognit..
[32] Nathalie Japkowicz,et al. The class imbalance problem: A systematic study , 2002, Intell. Data Anal..
[33] Rosa Maria Valdovinos,et al. New Applications of Ensembles of Classifiers , 2003, Pattern Analysis & Applications.
[34] Rong Yan,et al. On predicting rare classes with SVM ensembles in scene classification , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03)..
[35] Edward Y. Chang,et al. Statistical learning for effective visual information retrieval , 2003, Proceedings 2003 International Conference on Image Processing (Cat. No.03CH37429).
[36] Nitesh V. Chawla,et al. SMOTEBoost: Improving Prediction of the Minority Class in Boosting , 2003, PKDD.
[37] Foster J. Provost,et al. Learning When Training Data are Costly: The Effect of Class Distribution on Tree Induction , 2003, J. Artif. Intell. Res..
[38] Taeho Jo,et al. A Multiple Resampling Method for Learning from Imbalanced Data Sets , 2004, Comput. Intell..
[39] Jerome H. Friedman,et al. On Bias, Variance, 0/1-Loss, and the Curse-of-Dimensionality , 2004, Data Mining and Knowledge Discovery.
[40] Ludmila I. Kuncheva,et al. Measures of Diversity in Classifier Ensembles and Their Relationship with the Ensemble Accuracy , 2003, Machine Learning.
[41] Pedro M. Domingos,et al. Tree Induction for Probability-Based Ranking , 2003, Machine Learning.
[42] Gareth James,et al. Variance and Bias for General Loss Functions , 2003, Machine Learning.
[43] Ludmila I. Kuncheva,et al. Combining Pattern Classifiers: Methods and Algorithms , 2004 .
[44] Leo Breiman,et al. Pasting Small Votes for Classification in Large Databases and On-Line , 1999, Machine Learning.
[45] Stan Matwin,et al. Machine Learning for the Detection of Oil Spills in Satellite Radar Images , 1998, Machine Learning.
[46] Gustavo E. A. P. A. Batista,et al. A study of the behavior of several methods for balancing machine learning training data , 2004, SKDD.
[47] Yi Lin,et al. Support Vector Machines for Classification in Nonstandard Situations , 2002, Machine Learning.
[48] Herna L. Viktor,et al. Learning from imbalanced data sets with boosting and data generation: the DataBoost-IM approach , 2004, SKDD.
[49] Cynthia Rudin,et al. The Dynamics of AdaBoost: Cyclic Behavior and Convergence of Margins , 2004, J. Mach. Learn. Res..
[50] Leo Breiman,et al. Bagging Predictors , 1996, Machine Learning.
[51] R. Schapire. The Strength of Weak Learnability , 1990, Machine Learning.
[52] Edward Y. Chang,et al. KBA: kernel boundary alignment considering imbalanced data distribution , 2005, IEEE Transactions on Knowledge and Data Engineering.
[53] J. Ross Quinlan. Improved Estimates for the Accuracy of Small Disjuncts , 2005, Machine Learning.
[54] Xin Yao,et al. Diversity creation methods: a survey and categorisation , 2004, Inf. Fusion.
[55] Subhash C. Bagui,et al. Combining Pattern Classifiers: Methods and Algorithms , 2005, Technometrics.
[56] Charles X. Ling,et al. Using AUC and accuracy in evaluating learning algorithms , 2005, IEEE Transactions on Knowledge and Data Engineering.
[57] Ludmila I. Kuncheva. Diversity in multiple classifier systems , 2005, Inf. Fusion.
[58] Nitesh V. Chawla,et al. Data Mining for Imbalanced Datasets: An Overview , 2005, The Data Mining and Knowledge Discovery Handbook.
[59] Yi-Hung Liu,et al. Total margin based adaptive fuzzy support vector machines for multiview face recognition , 2005, 2005 IEEE International Conference on Systems, Man and Cybernetics.
[60] Qiang Yang,et al. Test strategies for cost-sensitive decision trees , 2006, IEEE Transactions on Knowledge and Data Engineering.
[61] R. Polikar,et al. Ensemble based systems in decision making , 2006, IEEE Circuits and Systems Magazine.
[62] Xuelong Li,et al. Asymmetric bagging and random subspace for support vector machines-based relevance feedback in image retrieval , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[63] Xindong Wu,et al. 10 Challenging Problems in Data Mining Research , 2006, Int. J. Inf. Technol. Decis. Mak..
[64] David A. Cieslak,et al. Combating imbalance in network intrusion datasets , 2006, 2006 IEEE International Conference on Granular Computing.
[65] Zhi-Hua Zhou,et al. Exploratory Under-Sampling for Class-Imbalance Learning , 2006, Sixth International Conference on Data Mining (ICDM'06).
[66] Janez Demsar,et al. Statistical Comparisons of Classifiers over Multiple Data Sets , 2006, J. Mach. Learn. Res..
[67] Hewijin Christine Jiau,et al. Evaluation of neural networks and data mining methods on a credit assessment task for class imbalance problem , 2006 .
[68] Kemal Kilic,et al. Comparison of Different Strategies of Utilizing Fuzzy Clustering in Structure Identification , 2007, Inf. Sci..
[69] Cen Li,et al. Classifying imbalanced data using a bagging ensemble variation (BEV) , 2007, ACM-SE 45.
[70] Pavel Brazdil,et al. Cost-Sensitive Decision Trees Applied to Medical Data , 2007, DaWaK.
[71] José Salvador Sánchez,et al. On the k-NN performance in a challenging scenario of imbalance and overlapping , 2008, Pattern Analysis and Applications.
[72] Chao-Ton Su,et al. An Evaluation of the Robustness of MTS for Imbalanced Data , 2007, IEEE Transactions on Knowledge and Data Engineering.
[73] Philip S. Yu,et al. Top 10 algorithms in data mining , 2007, Knowledge and Information Systems.
[74] Yang Wang,et al. Cost-sensitive boosting for classification of imbalanced data , 2007, Pattern Recognit..
[75] Randy H. Moss,et al. A methodological approach to the classification of dermoscopy images , 2007, Comput. Medical Imaging Graph..
[76] Xiang Peng,et al. Robust BMPM training based on second-order cone programming and its application in medical diagnosis , 2008, Neural Networks.
[77] S. García,et al. An Extension on "Statistical Comparisons of Classifiers over Multiple Data Sets" for all Pairwise Comparisons , 2008 .
[78] David A. Cieslak,et al. Learning Decision Trees for Unbalanced Data , 2008, ECML/PKDD.
[79] Wei-Zhen Lu,et al. Ground-level ozone prediction by support vector machine approach with a cost-sensitive classification scheme , 2008, The Science of the Total Environment.
[80] Szymon Wilk,et al. Selective Pre-processing of Imbalanced Data for Improving Classification Performance , 2008, DaWaK.
[81] Hisashi Kashima,et al. Roughly balanced bagging for imbalanced data , 2009, Stat. Anal. Data Min..
[82] David A. Cieslak,et al. Automatically countering imbalance and its empirical relationship to cost , 2008, Data Mining and Knowledge Discovery.
[83] Shichao Zhang,et al. A Strategy for Attributes Selection in Cost-Sensitive Decision Trees Induction , 2008, 2008 IEEE 8th International Conference on Computer and Information Technology Workshops.
[84] Kagan Tumer,et al. Classifier ensembles: Select real-world applications , 2008, Inf. Fusion.
[85] David A. Cieslak,et al. Start Globally, Optimize Locally, Predict Globally: Improving Performance on Imbalanced Data , 2008, 2008 Eighth IEEE International Conference on Data Mining.
[86] María José del Jesús,et al. A study of the behaviour of linguistic fuzzy rule based classification systems in the framework of imbalanced data-sets , 2008, Fuzzy Sets Syst..
[87] Jacek M. Zurada,et al. Training neural network classifiers for medical decision making: The effects of imbalanced datasets on classification performance , 2008, Neural Networks.
[88] María José del Jesús,et al. KEEL: a software tool to assess evolutionary algorithms for data mining problems , 2008, Soft Comput..
[89] Francisco Herrera,et al. Enhancing the effectiveness and interpretability of decision tree and rule induction classifiers with evolutionary training set selection over imbalanced problems , 2009, Appl. Soft Comput..
[90] Lior Rokach,et al. Taxonomy for characterizing ensemble methods in classification tasks: A review and annotated bibliography , 2009, Comput. Stat. Data Anal..
[91] Lior Rokach,et al. Ensemble-based classifiers , 2010, Artificial Intelligence Review.
[92] Francisco Herrera,et al. A study of statistical techniques and performance measures for genetics-based machine learning: accuracy and interpretability , 2009, Soft Comput..
[93] David P. Williams,et al. Mine Classification With Imbalanced Data , 2009, IEEE Geoscience and Remote Sensing Letters.
[94] Q. Henry Wu,et al. Association Rule Mining-Based Dissolved Gas Analysis for Fault Diagnosis of Power Transformers , 2009, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[95] Taghi M. Khoshgoftaar,et al. Evolutionary Sampling and Software Quality Modeling of High-Assurance Systems , 2009, IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans.
[96] Xin Yao,et al. Diversity analysis on imbalanced data sets by using ensemble models , 2009, 2009 IEEE Symposium on Computational Intelligence and Data Mining.
[97] Ester Bernadó-Mansilla,et al. Evolutionary rule-based systems for imbalanced data sets , 2008, Soft Comput..
[98] Ying He,et al. MSMOTE: Improving Classification Performance When Training Data is Imbalanced , 2009, 2009 Second International Workshop on Computer Science and Engineering.
[99] Andrew K. C. Wong,et al. Classification of Imbalanced Data: a Review , 2009, Int. J. Pattern Recognit. Artif. Intell..
[100] Taghi M. Khoshgoftaar,et al. An empirical comparison of repetitive undersampling techniques , 2009, 2009 IEEE International Conference on Information Reuse & Integration.
[101] Zhi-Bo Zhu,et al. Fault diagnosis based on imbalance modified kernel Fisher discriminant analysis , 2010 .
[102] Szymon Wilk,et al. Learning from Imbalanced Data in Presence of Noisy and Borderline Examples , 2010, RSCTC.
[103] Szymon Wilk,et al. Integrating Selective Pre-processing of Imbalanced Data with Ivotes Ensemble , 2010, RSCTC.
[104] Taghi M. Khoshgoftaar,et al. RUSBoost: A Hybrid Approach to Alleviating Class Imbalance , 2010, IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans.
[105] Francisco Herrera,et al. Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: Experimental analysis of power , 2010, Inf. Sci..
[106] Bernardete Ribeiro,et al. Distributed Text Classification With an Ensemble Kernel-Based Learning Approach , 2010, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[107] Ali A. Ghorbani,et al. Toward Credible Evaluation of Anomaly-Based Intrusion-Detection Methods , 2010, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[108] Robert Sabourin,et al. Iterative Boolean combination of classifiers in the ROC space: An application to anomaly detection with HMMs , 2010, Pattern Recognit..
[109] Yun Yang,et al. Time Series Clustering Via RPCL Network Ensemble With Different Representations , 2011, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[110] Hong Qiao,et al. An Efficient Tree Classifier Ensemble-Based Approach for Pedestrian Detection , 2011, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[111] Jesús Alcalá-Fdez,et al. KEEL Data-Mining Software Tool: Data Set Repository, Integration of Algorithms and Experimental Analysis Framework , 2011, J. Multiple Valued Log. Soft Comput..
[112] Jose Miguel Puerta,et al. Improving the performance of Naive Bayes multinomial in e-mail foldering by introducing distribution-based balance of datasets , 2011, Expert Syst. Appl..