Accelerating Deep Learning with Shrinkage and Recall

Deep learning is a powerful family of machine learning models, but it trains a large number of parameters across multiple layers and becomes very slow when both the data set and the architecture are large. Inspired by the shrinking technique used to accelerate Support Vector Machine (SVM) training and the screening technique used in LASSO, we propose shrinking Deep Learning with recall (sDLr) to speed up deep learning computation. We evaluate sDLr with Deep Neural Networks (DNN), Deep Belief Networks (DBN), and Convolutional Neural Networks (CNN) on 4 data sets. Results show that sDLr can reach a speedup of more than 2.0 while still giving competitive classification performance.
