On the Transferability of Winning Tickets in Non-Natural Image Datasets

We study the generalization properties of pruned neural networks that are winners of the lottery ticket hypothesis and were found on datasets of natural images. We analyse their potential under conditions in which training data is scarce and comes from a non-natural domain. Specifically, we investigate whether pruned models found on the popular CIFAR-10/100 and Fashion-MNIST datasets generalize to seven different datasets from the fields of digital pathology and digital heritage. Our results show that there are significant benefits in transferring and training sparse architectures over larger parametrized models: in all of our experiments, the pruned networks that are winners of the lottery ticket hypothesis significantly outperform their larger unpruned counterparts. These results suggest that winning initializations do contain inductive biases that are generic to some extent, although, as our experiments on the biomedical datasets show, their generalization properties can be more limited than what has so far been observed in the literature.
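The abstract does not spell out how a winning ticket is obtained, so the following is only a minimal sketch of iterative magnitude pruning with a reset to the original initialization, as it is commonly described in the lottery ticket literature. It is not the authors' implementation. The names `magnitude_mask`, `find_ticket`, and `train_fn` are hypothetical, weights are kept as a flat Python list for brevity, and the toy `train_fn` merely stands in for real training on a source dataset such as CIFAR-10.

```python
def magnitude_mask(weights, mask, prune_fraction):
    """Return a new mask that zeroes the smallest-magnitude fraction
    of the weights that are still active under `mask`."""
    active_idx = [i for i, m in enumerate(mask) if m]
    active_idx.sort(key=lambda i: abs(weights[i]))
    n_prune = int(len(active_idx) * prune_fraction)
    new_mask = list(mask)
    for i in active_idx[:n_prune]:
        new_mask[i] = 0
    return new_mask


def find_ticket(init_weights, train_fn, rounds, prune_fraction):
    """Iteratively train, prune by magnitude, and reset the surviving
    weights to their original initial values.

    `train_fn(weights, mask)` is an assumed callback that returns the
    trained weights; here it abstracts away the whole training loop.
    """
    mask = [1] * len(init_weights)
    for _ in range(rounds):
        # Reset to the original initialization, keeping only unpruned weights.
        start = [w * m for w, m in zip(init_weights, mask)]
        trained = train_fn(start, mask)
        mask = magnitude_mask(trained, mask, prune_fraction)
    ticket = [w * m for w, m in zip(init_weights, mask)]
    return ticket, mask


# Toy stand-in for training: it only rescales the weights (an assumption
# made so the example is deterministic and self-contained).
def toy_train(weights, mask):
    return [2.0 * w for w in weights]


init = [0.5, -0.1, 0.9, 0.05, -0.7, 0.3, -0.2, 0.8]
ticket, mask = find_ticket(init, toy_train, rounds=2, prune_fraction=0.25)
print(mask)  # which initial weights survive after two pruning rounds
```

In the transfer setting studied here, the resulting `ticket` and `mask` would be found on a natural-image source dataset and then trained on the target dataset with the mask held fixed, instead of training the unpruned network from scratch on that target.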
