Negative Correlation Ensemble Learning for Ordinal Regression

In this paper, two neural network threshold ensemble models are proposed for ordinal regression problems. In the first ensemble method, the thresholds are fixed a priori and are not modified during training. The second method treats the thresholds of each ensemble member as free parameters, allowing them to be modified during training. This is achieved through a reformulation of the tunable thresholds that removes the ordering constraints they must fulfill in the ordinal regression problem. During training, the diversity among the projections generated by the ensemble members is taken into account when updating the parameters. This diversity is promoted explicitly through a diversity-encouraging error function, which extends the well-known negative correlation learning framework to ordinal regression and inherits many of its good properties. Experimental results demonstrate that the proposed algorithms achieve competitive generalization performance across four ordinal regression metrics.
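The two ideas summarized above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the exact reformulation or error function, so the sketch assumes a common choice for each. Thresholds are kept ordered by writing each one as the first threshold plus a sum of squared free parameters, and diversity is encouraged with the classic negative correlation learning penalty applied to scalar member projections. The function names, the parameter `lam`, and the use of a real-valued target for the projection are all hypothetical.

```python
from itertools import accumulate


def ordered_thresholds(first, gaps):
    """Map unconstrained parameters to non-decreasing thresholds.

    Assumed parameterisation: b_1 = first and b_j = b_1 + sum of gaps[k]**2
    for k < j, so b_1 <= b_2 <= ... holds for any real-valued gaps.
    """
    return list(accumulate([first] + [g * g for g in gaps]))


def predict_rank(projection, thresholds):
    """Return the 1-based ordinal class for a scalar projection.

    The class is one plus the number of thresholds lying below the projection.
    """
    return 1 + sum(projection > b for b in thresholds)


def ncl_member_errors(projections, target, lam):
    """Per-member negative correlation error for a single training pattern.

    Classic form: e_i = 0.5 * (f_i - target)**2 - lam * (f_i - f_bar)**2,
    where f_bar is the mean projection of the ensemble. Using a real-valued
    target for the projection is an assumption made for illustration.
    """
    f_bar = sum(projections) / len(projections)
    return [0.5 * (f - target) ** 2 - lam * (f - f_bar) ** 2 for f in projections]
```

Because the gaps enter only through their squares, gradient-based training can update them freely without ever violating the threshold ordering. Setting `lam` to zero recovers independent training of each member, while larger values reward members whose projections deviate from the ensemble mean.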
