论文信息 - ImageNet classification with deep convolutional neural networks

ImageNet classification with deep convolutional neural networks

We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0%, respectively, which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully connected layers we employed a recently developed regularization method called "dropout" that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.

[1] P. Werbos,et al. Beyond Regression : "New Tools for Prediction and Analysis in the Behavioral Sciences , 1974 .

[2] S. Linnainmaa. Taylor expansion of the accumulated rounding error , 1976 .

[3] Yann LeCun,et al. Une procedure d'apprentissage pour reseau a seuil asymmetrique (A learning scheme for asymmetric threshold networks) , 1985 .

[4] Geoffrey E. Hinton,et al. Learning internal representations by error propagation , 1986 .

[5] Lawrence D. Jackel,et al. Handwritten Digit Recognition with a Back-Propagation Network , 1989, NIPS.

[6] Patrice Y. Simard,et al. Best practices for convolutional neural networks applied to visual document analysis , 2003, Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings..

[7] Leo Breiman,et al. Random Forests , 2001, Machine Learning.

[8] Kunihiko Fukushima,et al. Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position , 1980, Biological Cybernetics.

[9] Pietro Perona,et al. Learning Generative Visual Models from Few Training Examples: An Incremental Bayesian Approach Tested on 101 Object Categories , 2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop.

[10] Y. LeCun,et al. Learning methods for generic object recognition with invariance to pose and lighting , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[11] Antonio Torralba,et al. LabelMe: A Database and Web-Based Tool for Image Annotation , 2008, International Journal of Computer Vision.

[12] G. Griffin,et al. Caltech-256 Object Category Dataset , 2007 .

[13] Yehuda Koren,et al. Lessons from the Netflix prize challenge , 2007, SKDD.

[14] Nicolas Pinto,et al. Why is Real-World Visual Object Recognition Hard? , 2008, PLoS Comput. Biol..

[15] Li Fei-Fei,et al. ImageNet: A large-scale hierarchical image database , 2009, CVPR.

[16] Honglak Lee,et al. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations , 2009, ICML '09.

[17] Yann LeCun,et al. What is the best multi-stage architecture for object recognition? , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[18] Alexander Toshev,et al. Shape-based object recognition in videos using 3D synthetic object models , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[19] Alex Krizhevsky,et al. Learning Multiple Layers of Features from Tiny Images , 2009 .

[20] David D. Cox,et al. A High-Throughput Screening Approach to Discovering Good Forms of Biologically Inspired Visual Representation , 2009, PLoS Comput. Biol..

[21] Geoffrey E. Hinton,et al. Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.

[22] A. Krizhevsky. Convolutional Deep Belief Networks on CIFAR-10 , 2010 .

[23] Yann LeCun,et al. Convolutional networks and applications in vision , 2010, Proceedings of 2010 IEEE International Symposium on Circuits and Systems.

[24] Francisco Javier de Cos Juez,et al. Artificial neural networks applied to cancer detection in a breast screening programme , 2010, Math. Comput. Model..

[25] Joseph F. Murray,et al. Convolutional Networks Can Learn to Generate Affinity Graphs for Image Segmentation , 2010, Neural Computation.

[26] Luca Maria Gambardella,et al. High-Performance Neural Networks for Visual Object Classification , 2011, ArXiv.

[27] Geoffrey E. Hinton,et al. Using very deep autoencoders for content-based image retrieval , 2011, ESANN.

[28] Florent Perronnin,et al. High-dimensional signature compression for large-scale image classification , 2011, CVPR 2011.

[29] Nitish Srivastava,et al. Improving neural networks by preventing co-adaptation of feature detectors , 2012, ArXiv.

[30] Jürgen Schmidhuber,et al. Multi-column deep neural networks for image classification , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[31] Gabriela Csurka,et al. Metric Learning for Large Scale Image Classification: Generalizing to New Classes at Near-Zero Cost , 2012, ECCV.

[32] Jürgen Schmidhuber,et al. Multi-column deep neural network for traffic sign classification , 2012, Neural Networks.