A New Adaptive Merging and Growing Algorithm for Designing Artificial Neural Networks

This paper presents a new algorithm, called the adaptive merging and growing algorithm (AMGA), for designing artificial neural networks (ANNs). The algorithm merges and adds hidden neurons during the training of an ANN. The merge operation introduced in AMGA is a mixed-mode operation, equivalent to pruning two neurons and adding one. Unlike most previous studies, AMGA emphasizes autonomous functioning in the ANN design process; this is the main reason AMGA uses an adaptive strategy rather than a predefined, fixed one. The adaptive strategy merges or adds hidden neurons based on the learning ability of the hidden neurons and on the training progress of the ANN. To reduce the amount of retraining needed after the architecture is modified, AMGA prunes hidden neurons by merging correlated hidden neurons and adds hidden neurons by splitting existing ones. AMGA has been tested on a number of benchmark problems in machine learning and ANNs, including the breast cancer, Australian credit card assessment, diabetes, gene, glass, heart, iris, and thyroid problems. The experimental results show that AMGA can design compact ANN architectures with good generalization ability compared with other algorithms.
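To make the two architectural operations concrete, the NumPy sketch below shows what merging two correlated hidden neurons and splitting a hidden neuron could look like for a single-hidden-layer network. This is an illustration only: the specific update rules (convex combination of incoming weights, summed or halved outgoing weights, a (1 ± beta) perturbation on split) and the correlation-based pair selection are assumptions for the sketch, not the paper's exact equations.

```python
import numpy as np


def merge_hidden_neurons(W_in, W_out, i, j, alpha=0.5):
    """Merge hidden neurons i and j (require i < j) into a single neuron.

    W_in  : (n_hidden, n_inputs) array, one row per hidden neuron.
    W_out : (n_outputs, n_hidden) array, one column per hidden neuron.

    Hypothetical rule: the merged neuron takes a convex combination of the
    two incoming weight vectors and the sum of the two outgoing weight
    vectors, so the layer's output is roughly preserved when the two
    neurons are strongly correlated. Net effect: prune two, add one.
    """
    assert i < j, "pass the smaller index first"
    W_in, W_out = W_in.copy(), W_out.copy()
    W_in[i] = alpha * W_in[i] + (1.0 - alpha) * W_in[j]
    W_out[:, i] += W_out[:, j]
    return np.delete(W_in, j, axis=0), np.delete(W_out, j, axis=1)


def split_hidden_neuron(W_in, W_out, i, beta=0.1):
    """Split hidden neuron i into two children (growing step).

    Hypothetical rule: the children receive the parent's incoming weights
    scaled by (1 +/- beta) and each gets half of the outgoing weights, so
    the network function is only mildly perturbed before training resumes.
    """
    parent_in, parent_out = W_in[i].copy(), W_out[:, i].copy()
    W_in = np.insert(W_in, i + 1, (1.0 - beta) * parent_in, axis=0)
    W_in[i] = (1.0 + beta) * parent_in
    W_out = np.insert(W_out, i + 1, 0.5 * parent_out, axis=1)
    W_out[:, i] = 0.5 * parent_out
    return W_in, W_out


# Usage: merge the two most correlated hidden neurons of a toy network.
rng = np.random.default_rng(0)
W_in = rng.standard_normal((4, 3))    # 4 hidden neurons, 3 inputs
W_out = rng.standard_normal((2, 4))   # 2 outputs

H = np.tanh(rng.standard_normal((100, 3)) @ W_in.T)  # hidden activations
corr = np.corrcoef(H, rowvar=False)                  # neuron-pair correlations
np.fill_diagonal(corr, 0.0)
i, j = np.unravel_index(np.abs(corr).argmax(), corr.shape)
i, j = min(i, j), max(i, j)
W_in, W_out = merge_hidden_neurons(W_in, W_out, i, j)
print(W_in.shape, W_out.shape)        # (3, 3) (2, 3)
```

Selecting the pair by activation correlation (rather than by weight similarity) matches the abstract's emphasis on merging correlated hidden neurons; when and how often to merge or split would, per the abstract, be driven adaptively by the neurons' learning ability and the training progress.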
