Trade-Off Between Diversity and Accuracy in Ensemble Generation

Summary. Ensembles of learning machines have been shown, both formally and empirically, to outperform (i.e. generalise better than) single learners in many cases. Evidence suggests that an ensemble generalises well when its members form a set that is both diverse and accurate. Diversity and accuracy are therefore two factors that should be addressed when designing ensembles, yet there is a trade-off between them. Multi-objective evolutionary algorithms can be employed to handle this trade-off effectively. This chapter gives a brief overview of ensemble learning in general and presents a critique of the utility of multi-objective evolutionary algorithms for ensemble design. Theoretical aspects of a committee of learners, namely the bias-variance-covariance decomposition and the ambiguity decomposition, are then discussed to underline the importance of having both diversity and accuracy in ensembles. Finally, some recent work and experimental results on multi-objective learning of ensembles, focusing on classification tasks in particular, are presented as we examine ensemble formation using neural networks and kernel machines.
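
For concreteness, the two decompositions referred to above take the following standard forms for squared loss; the notation here (target d, members f_i, convex combination weights w_i) is generic rather than the chapter's own.

Ambiguity decomposition (Krogh and Vedelsby): for an ensemble \(\bar{f}(x) = \sum_{i=1}^{M} w_i f_i(x)\) with \(w_i \ge 0\) and \(\sum_i w_i = 1\),
\[
\bigl(\bar{f}(x) - d\bigr)^2 \;=\; \sum_{i=1}^{M} w_i \bigl(f_i(x) - d\bigr)^2 \;-\; \sum_{i=1}^{M} w_i \bigl(f_i(x) - \bar{f}(x)\bigr)^2 ,
\]
so the ensemble error equals the weighted average member error minus the ambiguity (diversity) term.

Bias-variance-covariance decomposition (Ueda and Nakano): for a uniformly weighted ensemble (\(w_i = 1/M\)),
\[
E\bigl[(\bar{f} - d)^2\bigr] \;=\; \overline{\mathrm{bias}}^{\,2} \;+\; \frac{1}{M}\,\overline{\mathrm{var}} \;+\; \Bigl(1 - \frac{1}{M}\Bigr)\,\overline{\mathrm{covar}} ,
\]
where the bars denote the member-averaged bias, variance, and pairwise covariance. As M grows, the covariance term dominates, which is why reducing correlation among members (diversity) matters alongside reducing their individual error (accuracy).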

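To make the multi-objective view concrete, the sketch below (a minimal illustration in Python/NumPy, not the chapter's algorithm or any specific published method) scores a pool of candidate learners on two objectives, individual error (accuracy) and negative ambiguity (diversity), and keeps the Pareto non-dominated candidates; a full multi-objective evolutionary algorithm would use such a front to drive selection and variation over generations. The toy data and all names are illustrative assumptions.

# Illustrative sketch only: score candidate regressors on two objectives
# (individual error, negative ambiguity) and extract the Pareto front.
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D regression data: y = sin(x) + noise.
x = np.linspace(-3, 3, 200)
y = np.sin(x) + rng.normal(scale=0.1, size=x.shape)

def fit_candidate():
    """One candidate learner: a degree-3 polynomial fit on a bootstrap resample."""
    idx = rng.integers(0, len(x), size=len(x))
    coeffs = np.polyfit(x[idx], y[idx], deg=3)
    return np.polyval(coeffs, x)          # predictions on all inputs

preds = np.array([fit_candidate() for _ in range(30)])   # pool of 30 candidates
ens_mean = preds.mean(axis=0)                            # simple-average ensemble

# Objective 1: individual mean squared error (lower = more accurate).
error = ((preds - y) ** 2).mean(axis=1)
# Objective 2: negative ambiguity, -(f_i - f_bar)^2 averaged over inputs
# (lower = more diverse, so both objectives are minimised).
neg_ambiguity = -((preds - ens_mean) ** 2).mean(axis=1)

objs = np.stack([error, neg_ambiguity], axis=1)

def non_dominated(points):
    """Indices of points not dominated by any other (both objectives minimised)."""
    keep = []
    for i, p in enumerate(points):
        dominated = any(np.all(q <= p) and np.any(q < p)
                        for j, q in enumerate(points) if j != i)
        if not dominated:
            keep.append(i)
    return keep

front = non_dominated(objs)
print(f"{len(front)} of {len(preds)} candidates are Pareto-optimal "
      f"w.r.t. accuracy and diversity")

In an actual ensemble generator, the surviving front (or a subset of it) would be combined, for example by averaging or voting, and fed back into the evolutionary loop as the next parent population.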