Convolution by Evolution: Differentiable Pattern Producing Networks

In this work we introduce a differentiable version of the Compositional Pattern Producing Network (CPPN), called the Differentiable Pattern Producing Network (DPPN). Unlike in a standard CPPN, the topology of a DPPN is evolved while its weights are learned. A Lamarckian algorithm that combines evolution and learning produces DPPNs that reconstruct an image. Our main result is that DPPNs can be evolved and trained to compress the weights of a denoising autoencoder from 157,684 to roughly 200 parameters while achieving reconstruction accuracy comparable to a fully connected network with more than two orders of magnitude more parameters. The regularization imposed by the DPPN allows it to rediscover approximately convolutional network architectures embedded within a fully connected architecture. Such convolutional architectures are the current state of the art for many computer vision applications, so it is satisfying that DPPNs can discover this structure rather than having it built in by design. When trained on MNIST and tested on the Omniglot dataset, DPPNs also generalize better than directly encoded fully connected autoencoders. DPPNs therefore offer a new framework for integrating learning and evolution.
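
A minimal sketch of the indirect encoding behind this result, in plain PyTorch. The names here (DPPN, layer_coords) and the fixed topology are our illustrative assumptions; the paper additionally evolves the DPPN topology with a Lamarckian genetic algorithm and trains a denoising objective, both omitted below. A coordinate network of roughly 200 parameters generates the full 784 x 64 encoder weight matrix, and gradients flow through the generated weights back into the coordinate network's own parameters:

import torch
import torch.nn as nn

class DPPN(nn.Module):
    """Tiny coordinate network mapping a weight's coordinates to its value.
    With hidden=12 it has 217 parameters, yet it generates all 50,176
    weights of the 784 -> 64 encoder layer below."""
    def __init__(self, coord_dim=3, hidden=12):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(coord_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, coords):
        return self.net(coords).squeeze(-1)

def layer_coords(side=28, hidden=64):
    """One (x, y, h) triple per weight: pixel position plus hidden-unit index."""
    lin = torch.linspace(-1.0, 1.0, side)
    ys, xs = torch.meshgrid(lin, lin, indexing="ij")
    pix = torch.stack([xs.flatten(), ys.flatten()], dim=1)          # (784, 2)
    hs = torch.linspace(-1.0, 1.0, hidden)                          # (64,)
    return torch.cat([pix.repeat_interleave(hidden, dim=0),
                      hs.repeat(side * side).unsqueeze(1)], dim=1)  # (784*64, 3)

dppn, coords = DPPN(), layer_coords()
opt = torch.optim.Adam(dppn.parameters(), lr=1e-3)
images = torch.rand(32, 784)             # stand-in batch; the paper uses MNIST

for step in range(200):
    W = dppn(coords).view(784, 64)       # generated encoder weights
    h = torch.tanh(images @ W)           # encode
    recon = torch.sigmoid(h @ W.t())     # tied-weight decode
    loss = ((recon - images) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

Because the large weight matrix is a smooth function of pixel coordinates, nearby pixels receive similar incoming weights; this is the regularity that lets a DPPN rediscover approximately convolutional structure inside a fully connected layer.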
