A New Constructive Algorithm for Architectural and Functional Adaptation of Artificial Neural Networks

The generalization ability of artificial neural networks (ANNs) depends heavily on their architectures. Constructive algorithms provide an attractive, automatic way of determining a near-optimal ANN architecture for a given problem. Several such algorithms have been proposed in the literature and have demonstrated their effectiveness. This paper presents a new constructive algorithm (NCA) for automatically determining ANN architectures. Unlike most previous studies on determining ANN architectures, NCA emphasizes both architectural adaptation and functional adaptation in its architecture determination process. It uses a constructive approach to determine the number of hidden layers in an ANN and the number of neurons in each hidden layer. To achieve functional adaptation, NCA trains the hidden neurons of the ANN on different training sets, created by employing a concept similar to that used in boosting algorithms. The purpose of using different training sets is to encourage hidden neurons to learn different parts or aspects of the training data, so that the ANN learns the training data as a whole more effectively. The paper also analyzes the convergence and computational issues of NCA. The computational complexity of NCA is found to be $O(W \times P_t \times \tau)$, where $W$ is the number of weights in the ANN, $P_t$ is the number of training examples, and $\tau$ is the number of training epochs. This is of the same order as the complexity of the backpropagation algorithm for training a fixed ANN architecture. A set of eight classification and two approximation benchmark problems was used to evaluate the performance of NCA. The experimental results show that NCA produces ANN architectures with fewer hidden neurons and better generalization ability than existing constructive and nonconstructive algorithms.
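To make the growth loop concrete, below is a minimal sketch in Python (numpy only) of the two ideas the abstract describes: constructively adding hidden neurons one at a time, and training each new neuron on a boosting-style resampled training set that over-represents the examples the current network still gets wrong. This is a sketch under stated assumptions, not the authors' implementation: NCA also adds hidden layers and determines when to stop growing, both of which are omitted here, and every function and parameter name (`grow_network`, `max_neurons`, the weight-doubling update, and so on) is an illustrative choice rather than anything taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_hidden_neuron(X, y, epochs=200, lr=0.5):
    """Train a single sigmoidal unit by gradient descent on cross-entropy.
    Each epoch costs O(W * P) -- W weights times P examples -- which is the
    per-epoch term in the O(W x P_t x tau) complexity quoted above."""
    Xb = np.hstack([X, np.ones((len(X), 1))])          # append a bias input
    w = rng.normal(scale=0.1, size=Xb.shape[1])
    for _ in range(epochs):
        p = sigmoid(Xb @ w)
        w -= lr * Xb.T @ (p - y) / len(X)              # cross-entropy gradient
    return w

def grow_network(X, y, max_neurons=10):
    """Constructively add hidden neurons one at a time. Each new neuron is
    trained on a boosting-style resample of the data that over-represents
    the examples the current network still misclassifies."""
    example_w = np.full(len(X), 1.0 / len(X))          # uniform example weights
    Xb = np.hstack([X, np.ones((len(X), 1))])
    neurons = []
    for _ in range(max_neurons):
        # Resample the training set in proportion to the example weights.
        idx = rng.choice(len(X), size=len(X), p=example_w)
        neurons.append(train_hidden_neuron(X[idx], y[idx]))
        # Simple aggregate prediction: average all neurons trained so far.
        votes = np.mean([sigmoid(Xb @ w) for w in neurons], axis=0)
        wrong = (votes > 0.5) != (y > 0.5)
        if not wrong.any():
            break                                      # training data learned
        # Boosting-style update: emphasize the misclassified examples.
        example_w = np.where(wrong, example_w * 2.0, example_w)
        example_w /= example_w.sum()
    return neurons

if __name__ == "__main__":
    # Toy usage: a linearly separable problem that one neuron usually solves.
    X = rng.normal(size=(200, 2))
    y = (X[:, 0] + X[:, 1] > 0.0).astype(float)
    hidden = grow_network(X, y, max_neurons=5)
    print(f"grew {len(hidden)} hidden neuron(s)")
```

One deliberate simplification: the sketch combines neurons by averaging their outputs, which keeps the boosting analogy visible but sidesteps how NCA actually wires new neurons into the network and trains their output connections, details the abstract does not specify.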
