No Multiplication? No Floating Point? No Problem! Training Networks for Efficient Inference

For successful deployment of deep neural networks on highly resource-constrained devices (hearing aids, earbuds, wearables), we must simplify the types of operations and the memory/power resources used during inference. Completely avoiding inference-time floating-point operations is one of the simplest ways to design networks for these highly constrained environments. By discretizing both our in-network non-linearities and our network weights, we can move to simple, compact networks without floating-point operations, without multiplications, and without any non-linear function computations. Our approach allows us to explore the spectrum of possible networks, ranging from fully continuous versions down to networks with bi-level weights and activations. Our results show that discretization can be done without loss of performance, and that we can train a network that operates without floating point, without multiplication, and with less RAM on both regression tasks (autoencoding) and multi-class classification tasks (ImageNet). The memory needed to deploy our discretized networks is less than one third of that required by an equivalent architecture that uses floating-point operations.
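To make the claim concrete, the sketch below (Python/NumPy) illustrates why fully discretized weights and activations remove both floating point and multiplication from inference. The layer sizes, the {-1, +1} weight coding, the {0, 1} activations, and the integer thresholds standing in for bias plus non-linearity are illustrative assumptions, not the paper's exact scheme: with binary inputs, a dot product collapses into integer additions and subtractions followed by a comparison.

```python
# A minimal sketch (assumed coding, not the paper's exact method) of how
# inference can avoid floating point and multiplication once weights and
# activations are discrete: weights in {-1, +1}, activations in {0, 1}.

import numpy as np

def binary_dense_forward(x_bits: np.ndarray,
                         w_sign: np.ndarray,
                         thresholds: np.ndarray) -> np.ndarray:
    """Integer-only forward pass for one fully connected layer.

    x_bits:     (in_dim,)          activations in {0, 1}, integer dtype
    w_sign:     (in_dim, out_dim)  weights in {-1, +1}, integer dtype
    thresholds: (out_dim,)         integer thresholds standing in for
                                   bias + binary activation function
    """
    # With x in {0, 1}, x . w is just the sum of the signed weights at the
    # positions where x == 1 -- pure integer addition, no multiplications.
    pre_activation = w_sign[x_bits == 1].sum(axis=0)
    # Thresholding replaces the non-linearity and yields binary outputs.
    return (pre_activation >= thresholds).astype(np.int64)

# Tiny usage example with hypothetical shapes.
rng = np.random.default_rng(0)
x = rng.integers(0, 2, size=8)            # 8 binary input activations
w = rng.choice([-1, 1], size=(8, 4))      # 8x4 sign weights
t = rng.integers(-2, 3, size=4)           # integer thresholds per output unit
print(binary_dense_forward(x, w, t))      # e.g. a length-4 binary vector
```

Because every intermediate value is an integer, such a layer can run on hardware with no floating-point unit; the harder part, which the paper addresses, is training a network so that it still performs well after weights and activations are discretized (commonly handled with techniques such as straight-through gradient estimation).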
