Multiobjective Neural Network Ensembles Based on Regularized Negative Correlation Learning

Negative Correlation Learning (NCL) is a neural network ensemble learning algorithm that adds a correlation penalty term to the cost function of each individual network, so that each network minimizes its mean squared error (MSE) together with its correlation with the rest of the ensemble. This paper describes NCL in detail and observes that NCL corresponds to training the entire ensemble as a single learning machine that minimizes the MSE without any regularization. This insight explains why NCL is prone to overfitting noise in the training set. The paper analyzes this problem and proposes the multiobjective regularized negative correlation learning (MRNCL) algorithm, which incorporates an additional regularization term for the ensemble and uses an evolutionary multiobjective algorithm to design ensembles. In MRNCL, we define the crossover and mutation operators and adopt a nondominated sorting algorithm with fitness sharing and rank-based fitness assignment. Experiments on synthetic data as well as real-world data sets demonstrate that MRNCL achieves better performance than NCL, especially when the noise level in the data set is nontrivial. In the experimental discussion, we give three reasons why our algorithm outperforms others.
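
To make the objectives concrete, the NCL cost can be sketched in the standard form used by the negative correlation literature (refs. [1], [5], [42]); the notation below is assumed for illustration and is not quoted from the paper. For an ensemble of M networks with outputs f_i(x) and ensemble output \bar{f}(x) = \frac{1}{M}\sum_{i=1}^{M} f_i(x), each network i is trained on

    e_i = \frac{1}{N}\sum_{n=1}^{N}\Big[\big(f_i(x_n)-y_n\big)^2 + \lambda\, p_i(x_n)\Big], \qquad
    p_i(x_n) = \big(f_i(x_n)-\bar{f}(x_n)\big)\sum_{j\neq i}\big(f_j(x_n)-\bar{f}(x_n)\big),

and the regularized variant adds a weight-decay term of the form \alpha_i \mathbf{w}_i^{\top}\mathbf{w}_i. A multiobjective formulation can then keep the error, correlation, and regularization terms as separate objectives to be traded off by the evolutionary algorithm, rather than summing them with fixed weights.

The selection machinery named in the abstract (nondominated sorting, fitness sharing, rank-based fitness assignment) can be illustrated with the minimal Python sketch below; the function names, the sharing radius sigma_share, and the toy objective vectors are illustrative assumptions, not the paper's implementation.

    import numpy as np

    def dominates(a, b):
        # a Pareto-dominates b when a is no worse in every objective and
        # strictly better in at least one (all objectives minimized)
        a, b = np.asarray(a, float), np.asarray(b, float)
        return bool(np.all(a <= b) and np.any(a < b))

    def nondominated_sort(objs):
        # Pareto rank per individual: 0 for the nondominated front, 1 for the next, ...
        n = len(objs)
        ranks = [None] * n
        remaining = set(range(n))
        rank = 0
        while remaining:
            front = [i for i in remaining
                     if not any(dominates(objs[j], objs[i]) for j in remaining if j != i)]
            for i in front:
                ranks[i] = rank
            remaining -= set(front)
            rank += 1
        return ranks

    def shared_fitness(objs, ranks, sigma_share=0.1):
        # Rank-based fitness degraded by fitness sharing within each Pareto front
        objs = np.asarray(objs, float)
        n = len(objs)
        fitness = np.empty(n)
        for i in range(n):
            same_front = [j for j in range(n) if ranks[j] == ranks[i]]
            d = np.linalg.norm(objs[same_front] - objs[i], axis=1)
            niche_count = np.sum(np.maximum(0.0, 1.0 - d / sigma_share))
            # niche_count >= 1 because the individual counts itself (d = 0)
            fitness[i] = (1.0 / (1.0 + ranks[i])) / niche_count
        return fitness

    # Toy usage: three objectives per individual, e.g. (training error, correlation, weight norm)
    objs = [(0.10, 0.02, 1.3), (0.12, 0.01, 0.9), (0.30, 0.05, 2.0)]
    ranks = nondominated_sort(objs)
    print(ranks, shared_fitness(objs, ranks))

Individuals on better fronts receive higher base fitness, and crowded individuals within a front have their fitness divided down by the niche count, which is the usual way fitness sharing preserves diversity in the selected ensemble members.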

[1]  Xin Yao,et al.  Simultaneous training of negatively correlated neural networks in an ensemble , 1999, IEEE Trans. Syst. Man Cybern. Part B.

[2]  Xin Yao,et al.  A constructive algorithm for training cooperative neural network ensembles , 2003, IEEE Trans. Neural Networks.

[3]  Xin Yao,et al.  Ensemble Learning Using Multi-Objective Evolutionary Algorithms , 2006, J. Math. Model. Algorithms.

[4]  César Hervás-Martínez,et al.  Cooperative coevolution of artificial neural network ensembles for pattern classification , 2005, IEEE Transactions on Evolutionary Computation.

[5]  Xin Yao,et al.  Ensemble learning via negative correlation , 1999, Neural Networks.

[6]  Christian Igel,et al.  Empirical evaluation of the improved Rprop learning algorithms , 2003, Neurocomputing.

[7]  Robert E. Schapire,et al.  A Brief Introduction to Boosting , 1999, IJCAI.

[8]  Kalyanmoy Deb,et al.  Multi-Objective Function Optimization Using Non-Dominated Sorting Genetic Algorithms , 1994 .

[9]  Marco Laumanns,et al.  Performance assessment of multiobjective optimizers: an analysis and review , 2003, IEEE Trans. Evol. Comput..

[10]  Kalyanmoy Deb,et al.  Multiobjective Optimization Using Nondominated Sorting in Genetic Algorithms , 1994, Evolutionary Computation.

[11]  Hussein A. Abbass,et al.  A Memetic Pareto Evolutionary Approach to Artificial Neural Networks , 2001, Australian Joint Conference on Artificial Intelligence.

[12]  Janez Demsar,et al.  Statistical Comparisons of Classifiers over Multiple Data Sets , 2006, J. Mach. Learn. Res..

[13]  Leo Breiman,et al.  Bagging Predictors , 1996, Machine Learning.

[14]  Bernhard Sendhoff,et al.  Neural network regularization and ensembling using multi-objective evolutionary algorithms , 2004, Proceedings of the 2004 Congress on Evolutionary Computation (IEEE Cat. No.04TH8753).

[15]  Peter Tiño,et al.  Managing Diversity in Regression Ensembles , 2005, J. Mach. Learn. Res..

[16]  M. Friedman The Use of Ranks to Avoid the Assumption of Normality Implicit in the Analysis of Variance , 1937 .

[17]  O. J. Dunn Multiple Comparisons among Means , 1961 .

[18]  Anders Krogh,et al.  Neural Network Ensembles, Cross Validation, and Active Learning , 1994, NIPS.

[19]  Xin Yao,et al.  Neural-Based Learning Classifier Systems , 2008, IEEE Transactions on Knowledge and Data Engineering.

[20]  Xin Yao,et al.  Every Niching Method has its Niche: Fitness Sharing and Implicit Sharing Compared , 1996, PPSN.

[21]  Yong Zhao,et al.  All Zero Block Detection Based on Statistics for AVS-M Intra Frame Prediction , 2008, 2008 International Symposium on Intelligent Information Technology Application Workshops.

[22]  Huanhuan Chen,et al.  Evolutionary random neural ensembles based on negative correlation learning , 2007, 2007 IEEE Congress on Evolutionary Computation.

[23]  Elie Bienenstock,et al.  Neural Networks and the Bias/Variance Dilemma , 1992, Neural Computation.

[24]  Anders Krogh,et al.  A Simple Weight Decay Can Improve Generalization , 1991, NIPS.

[25]  Hussein A. Abbass,et al.  Anticorrelation: A Diversity Promoting Mechanism in Ensemble Learning , 2001 .

[26]  Robert B. Gramacy,et al.  Gaussian processes and limiting linear models , 2008, Comput. Stat. Data Anal..

[27]  Kalyanmoy Deb,et al.  A Fast Elitist Non-dominated Sorting Genetic Algorithm for Multi-objective Optimisation: NSGA-II , 2000, PPSN.

[28]  Vladimir N. Vapnik,et al.  The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.

[29]  Huanhuan Chen,et al.  Predictive Ensemble Pruning by Expectation Propagation , 2009, IEEE Transactions on Knowledge and Data Engineering.

[30]  Kalyanmoy Deb,et al.  Rapid, Accurate Optimization of Difficult Problems Using Fast Messy Genetic Algorithms , 1993, ICGA.

[31]  Luiz Eduardo Soares de Oliveira,et al.  Multi-objective Genetic Algorithms to Create Ensemble of Classifiers , 2005, EMO.

[32]  R. Iman,et al.  Approximations of the critical region of the Friedman statistic , 1980 .

[33]  Oleksandr Makeyev,et al.  Neural network with ensembles , 2010, The 2010 International Joint Conference on Neural Networks (IJCNN).

[34]  Xin Yao,et al.  Making use of population information in evolutionary artificial neural networks , 1998, IEEE Trans. Syst. Man Cybern. Part B.

[35]  Yoshua Bengio,et al.  Pattern Recognition and Neural Networks , 1995 .

[36]  Xin Yao,et al.  Evolving hybrid ensembles of learning machines for better generalisation , 2006, Neurocomputing.

[37]  Hussein A. Abbass,et al.  Speeding Up Backpropagation Using Multiobjective Evolutionary Algorithms , 2003, Neural Computation.

[38]  Xin Yao,et al.  Evolutionary ensembles with negative correlation learning , 2000, IEEE Trans. Evol. Comput..

[39]  G. Yule,et al.  On the association of attributes in statistics, with examples from the material of the childhood society, &c , 1900, Proceedings of the Royal Society of London.

[40]  Manfred M. Fischer,et al.  Neural network ensembles and their application to traffic flow prediction in telecommunications networks , 2001, IJCNN'01. International Joint Conference on Neural Networks. Proceedings (Cat. No.01CH37222).

[41]  Christian Igel,et al.  Improving the Rprop Learning Algorithm , 2000 .

[42]  Huanhuan Chen,et al.  Regularized Negative Correlation Learning for Neural Network Ensembles , 2009, IEEE Transactions on Neural Networks.

[43]  Tin Kam Ho,et al.  The Random Subspace Method for Constructing Decision Forests , 1998, IEEE Trans. Pattern Anal. Mach. Intell..

[44]  James E. Baker,et al.  Adaptive Selection Methods for Genetic Algorithms , 1985, International Conference on Genetic Algorithms.