Evolutionary neural AutoML for deep learning

Deep neural networks (DNNs) have produced state-of-the-art results in many benchmarks and problem domains. However, the success of DNNs depends on the proper configuration of their architectures and hyperparameters. Such configuration is difficult, and as a result, DNNs are often not used to their full potential. In addition, DNNs in commercial applications often need to satisfy real-world design constraints such as model size or parameter count. To make configuration easier, automatic machine learning (AutoML) systems for deep learning have been developed, focusing mostly on hyperparameter optimization. This paper takes AutoML a step further. It introduces an evolutionary AutoML framework called LEAF that optimizes not only hyperparameters but also network architectures and network size. LEAF makes use of both state-of-the-art evolutionary algorithms (EAs) and distributed computing frameworks. Experimental results on medical image classification and natural language analysis show that the framework can achieve state-of-the-art performance. In particular, LEAF demonstrates that architecture optimization provides a significant boost over hyperparameter optimization alone, and that networks can be minimized at the same time with little drop in performance. LEAF therefore forms a foundation for democratizing and improving AI, as well as making AI practical in future applications.
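The core idea, multi-objective evolutionary search over architectures and hyperparameters, can be illustrated with a minimal sketch. The Python below is not the LEAF implementation (LEAF builds on CoDeepNEAT and evaluates candidates in parallel on distributed cloud infrastructure); it is a toy loop over a simplified layer-width encoding, with a placeholder scoring function standing in for actual training, showing how candidates are selected jointly for performance (maximized) and parameter count (minimized) via Pareto dominance. All names here (`Candidate`, `evaluate`, `evolve`) are illustrative assumptions, not LEAF's API.

```python
import random
from dataclasses import dataclass


@dataclass(eq=False)
class Candidate:
    layers: list           # hidden-layer widths of a simple dense network
    lr: float              # learning-rate hyperparameter
    fitness: float = 0.0   # validation score (to be maximized)
    size: int = 0          # parameter count (to be minimized)


def evaluate(c, input_dim=784, output_dim=10):
    """Stand-in for decoding, training, and validating the network.
    A real system would train the DNN and report validation accuracy."""
    dims = [input_dim] + c.layers + [output_dim]
    c.size = sum(a * b + b for a, b in zip(dims, dims[1:]))  # dense weights + biases
    # Placeholder score: capacity helps with diminishing returns, plus noise.
    c.fitness = 1.0 - 1.0 / (1.0 + sum(c.layers) / 500.0) + random.gauss(0, 0.01)


def mutate(c):
    """Structural and hyperparameter mutations on a copy of the parent."""
    child = Candidate(layers=list(c.layers), lr=c.lr)
    op = random.choice(["widen", "add", "remove", "lr"])
    if op == "widen":
        i = random.randrange(len(child.layers))
        child.layers[i] = max(8, int(child.layers[i] * random.uniform(0.5, 2.0)))
    elif op == "add":
        child.layers.insert(random.randint(0, len(child.layers)),
                            random.choice([32, 64, 128]))
    elif op == "remove" and len(child.layers) > 1:
        child.layers.pop(random.randrange(len(child.layers)))
    else:
        child.lr *= random.uniform(0.5, 2.0)
    return child


def dominates(a, b):
    """Pareto dominance: at least as good on both objectives, better on one."""
    return (a.fitness >= b.fitness and a.size <= b.size and
            (a.fitness > b.fitness or a.size < b.size))


def evolve(pop_size=20, generations=30):
    pop = [Candidate(layers=[random.choice([32, 64, 128])
                             for _ in range(random.randint(1, 3))],
                     lr=10 ** random.uniform(-4, -2))
           for _ in range(pop_size)]
    for c in pop:
        evaluate(c)
    for _ in range(generations):
        children = [mutate(random.choice(pop)) for _ in range(pop_size)]
        for c in children:
            evaluate(c)  # LEAF would dispatch these evaluations to parallel workers
        combined = pop + children
        # NSGA-style survivor selection: Pareto front first, then best score.
        front = [c for c in combined if not any(dominates(o, c) for o in combined)]
        rest = sorted((c for c in combined if c not in front),
                      key=lambda c: c.fitness, reverse=True)
        pop = (front + rest)[:pop_size]
    return pop


if __name__ == "__main__":
    random.seed(0)
    for c in sorted(evolve(), key=lambda c: c.size)[:5]:
        print(f"layers={c.layers}  lr={c.lr:.1e}  params={c.size}  score={c.fitness:.3f}")
```

The same two-objective selection is what lets a single search run return a front of trade-off networks rather than one winner, so a deployment can accept a small drop in accuracy in exchange for a much smaller model, as the paper's minimization results suggest.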
