Fitness Landscape Analysis of Weight-Elimination Neural Networks

Neural networks can be regularised by adding a penalty term to the objective function, so that training minimises network complexity in addition to the error. However, any added penalty term inevitably changes the surface of the objective function. This study investigates the landscape changes induced by the weight-elimination penalty function under various parameter settings. Fitness landscape metrics are used to quantify and visualise the induced landscape changes, and to propose sensible ranges for the regularisation parameters. Fitness landscape metrics are shown to be a viable tool for analysing and visualising neural network objective function landscapes.
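For concreteness, the weight-elimination penalty of Weigend, Rumelhart, and Huberman scales each weight by a parameter w0 and adds lambda * sum_i (w_i/w0)^2 / (1 + (w_i/w0)^2) to the training error, so lambda and w0 are the two regularisation parameters whose settings shape the landscape. The Python sketch below illustrates the regularised objective; the function names and parameter values are illustrative assumptions, not taken from the study.

    import numpy as np

    def weight_elimination_penalty(weights, w0):
        # Penalty term: sum_i (w_i/w0)^2 / (1 + (w_i/w0)^2).
        # Weights much smaller than w0 contribute roughly 0, while
        # weights much larger than w0 saturate towards 1, so the sum
        # approximates a count of "active" weights and drives
        # redundant weights towards zero.
        scaled = (np.asarray(weights) / w0) ** 2
        return np.sum(scaled / (1.0 + scaled))

    def regularised_objective(error, weights, lam, w0):
        # Objective = training error + lambda * complexity penalty.
        return error + lam * weight_elimination_penalty(weights, w0)

    # Illustrative values only: lam and w0 are the parameters whose
    # sensible ranges the landscape analysis aims to identify.
    w = [-2.0, -0.1, 0.0, 0.05, 1.5]
    print(regularised_objective(error=0.42, weights=w, lam=0.01, w0=1.0))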
