Lambda-gamma learning with feedforward neural networks using particle swarm optimization

The sigmoid function is a widely used, bounded activation function for feedforward neural networks (FFNNs). A problem with bounded activation functions is that they necessitate scaling of the data to suit the fixed domain and range of the function. Alternatively, the activation function itself can be adapted by learning the gradient and range of the function alongside the FFNN weights. The purpose of this paper is to investigate whether the particle swarm optimization (PSO) algorithm is capable of training FFNNs that use adaptive sigmoid activation functions. The PSO algorithm is also compared against the gradient-based lambda-gamma backpropagation learning algorithm (LG-BP) on five classification and five regression data sets. Experiments are conducted with scaled and unscaled input data, as well as with target output ranges of increasing size. The PSO algorithm proves capable of training FFNNs with adaptive activation functions and significantly outperforms the LG-BP algorithm on all problems. With the PSO, the use of adaptive activation functions improves the training accuracy of the FFNN but leads to worse generalization performance due to overfitting. Increasing the size of the target output range increases the overfitting and worsens the generalization performance. Less overfitting is observed on data sets with unscaled input data.
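For concreteness, the adaptive sigmoid underlying lambda-gamma learning replaces the fixed logistic function with f(net) = γ / (1 + exp(−λ·net)), where λ controls the steepness (gradient) and γ the output range of each unit, and both are adjusted during training alongside the weights. The sketch below shows one way a gbest PSO with inertia weight could train such a network, encoding the weights, biases, and per-unit λ and γ values of a single-hidden-layer FFNN in each particle's position. All function names, the linear output layer, and the hyperparameter values are illustrative assumptions for this sketch, not the paper's exact experimental setup.

```python
import numpy as np

def adaptive_sigmoid(x, lam, gam):
    """Sigmoid with learnable steepness (lam) and output range (gam)."""
    return gam / (1.0 + np.exp(-lam * x))

def decode(position, n_in, n_hidden, n_out):
    """Unpack a flat particle position into weights and activation parameters."""
    i = 0
    W1 = position[i:i + n_in * n_hidden].reshape(n_in, n_hidden); i += n_in * n_hidden
    b1 = position[i:i + n_hidden]; i += n_hidden
    W2 = position[i:i + n_hidden * n_out].reshape(n_hidden, n_out); i += n_hidden * n_out
    b2 = position[i:i + n_out]; i += n_out
    lam = position[i:i + n_hidden]; i += n_hidden   # per-unit steepness (lambda)
    gam = position[i:i + n_hidden]                  # per-unit range (gamma)
    return W1, b1, W2, b2, lam, gam

def mse(position, X, T, n_in, n_hidden, n_out):
    """Training MSE of the network encoded by `position` (linear output layer assumed)."""
    W1, b1, W2, b2, lam, gam = decode(position, n_in, n_hidden, n_out)
    H = adaptive_sigmoid(X @ W1 + b1, lam, gam)
    Y = H @ W2 + b2
    return np.mean((Y - T) ** 2)

def pso_train(X, T, n_hidden=5, n_particles=20, iters=500,
              w=0.729, c1=1.49445, c2=1.49445, rng=np.random.default_rng(0)):
    """Train the adaptive-activation FFNN with a gbest PSO (inertia-weight form)."""
    n_in, n_out = X.shape[1], T.shape[1]
    # Weights + biases + lambda and gamma for each hidden unit.
    dim = n_in * n_hidden + n_hidden + n_hidden * n_out + n_out + 2 * n_hidden
    pos = rng.uniform(-1, 1, (n_particles, dim))
    vel = np.zeros((n_particles, dim))
    pbest = pos.copy()
    pbest_f = np.array([mse(p, X, T, n_in, n_hidden, n_out) for p in pos])
    gbest = pbest[np.argmin(pbest_f)].copy()

    for _ in range(iters):
        r1, r2 = rng.random((n_particles, dim)), rng.random((n_particles, dim))
        # Standard velocity and position updates with inertia weight w.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        f = np.array([mse(p, X, T, n_in, n_hidden, n_out) for p in pos])
        improved = f < pbest_f
        pbest[improved], pbest_f[improved] = pos[improved], f[improved]
        gbest = pbest[np.argmin(pbest_f)].copy()
    return gbest, pbest_f.min()
```

A call such as pso_train(X, T) returns the best particle found and its training MSE; decode(...) then recovers the trained weights together with the learned λ and γ values, so no prior scaling of the targets to a fixed sigmoid range is required.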
