Overfitting by PSO-trained feedforward neural networks

The purpose of this paper is to investigate the overfitting behavior of particle swarm optimization (PSO) trained neural networks. Neural networks trained with PSOs using the global best, local best, and Von Neumann information-sharing topologies are investigated. Experiments are conducted on five classification and five time series regression problems. It is shown that the degree of overfitting differs between the topologies. Additionally, non-convergence of the swarms is observed, which is hypothesized to be caused by the use of a bounded activation function in the neural networks. The hypothesis is supported by experiments conducted using an unbounded activation function in the neural network hidden layer, which led to convergent swarms. This also led to drastically reduced overfitting by the neural networks.
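To make the setup concrete, the following is a minimal sketch of training a one-hidden-layer feedforward network with a standard inertia-weight PSO using the global best (star) topology, one of the three topologies studied. The toy sine-regression task, network size, and all parameter values here are illustrative assumptions, not the paper's actual benchmarks or settings; `hidden_act` marks the hidden-layer activation whose boundedness the paper investigates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: learn y = sin(x) on [-pi, pi] (an illustrative
# stand-in for the paper's time series benchmarks).
X = np.linspace(-np.pi, np.pi, 50).reshape(-1, 1)
y = np.sin(X)

N_HIDDEN = 5
N_W = 1 * N_HIDDEN + N_HIDDEN + N_HIDDEN * 1 + 1  # all weights + biases

def forward(w, X, hidden_act=np.tanh):
    """One-hidden-layer feedforward net; hidden_act is the (bounded
    or unbounded) hidden activation whose effect the paper studies."""
    i = 0
    W1 = w[i:i + N_HIDDEN].reshape(1, N_HIDDEN); i += N_HIDDEN
    b1 = w[i:i + N_HIDDEN]; i += N_HIDDEN
    W2 = w[i:i + N_HIDDEN].reshape(N_HIDDEN, 1); i += N_HIDDEN
    b2 = w[i:]
    return hidden_act(X @ W1 + b1) @ W2 + b2

def mse(w):
    """Training error of one particle (a full weight vector)."""
    return float(np.mean((forward(w, X) - y) ** 2))

# Inertia-weight PSO, global best topology: every particle is
# attracted to the single best position found by the whole swarm.
n_particles, iters = 20, 200
w_in, c1, c2 = 0.729, 1.49445, 1.49445  # common settings from the literature
pos = rng.uniform(-1, 1, (n_particles, N_W))
vel = np.zeros((n_particles, N_W))
pbest = pos.copy()
pbest_f = np.array([mse(p) for p in pos])
gbest = pbest[pbest_f.argmin()].copy()

for _ in range(iters):
    r1 = rng.random((n_particles, N_W))
    r2 = rng.random((n_particles, N_W))
    vel = w_in * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = pos + vel
    f = np.array([mse(p) for p in pos])
    improved = f < pbest_f
    pbest[improved] = pos[improved]
    pbest_f[improved] = f[improved]
    gbest = pbest[pbest_f.argmin()].copy()

print(mse(gbest))  # training MSE of the best weight vector found
```

The local best (ring) and Von Neumann topologies differ only in the attractor: each particle is pulled toward the best position in its neighborhood rather than the swarm-wide `gbest`, which slows information sharing and, as the paper shows, changes the degree of overfitting.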
