EXPONENTIAL LINEAR UNITS (ELUS)
[1] Kanter, et al. Eigenvalues of covariance matrices: Application to neural-network learning, 1991, Physical Review Letters.
[2] Takio Kurita, et al. Iterative weighted least squares algorithms for neural networks classifiers, 1992, New Generation Computing.
[3] Shun-ichi Amari, et al. Natural Gradient Works Efficiently in Learning, 1998, Neural Computation.
[4] Shun-ichi Amari, et al. Complexity Issues in Natural Gradient Descent Method for Training Multilayer Perceptrons, 1998, Neural Computation.
[5] Sepp Hochreiter, et al. The Vanishing Gradient Problem During Learning Recurrent Neural Nets and Problem Solutions, 1998, Int. J. Uncertain. Fuzziness Knowl. Based Syst.
[6] Nicol N. Schraudolph, et al. A Fast, Compact Approximation of the Exponential Function, 1999, Neural Computation.
[7] Jürgen Schmidhuber, et al. Feature Extraction Through LOCOCODE, 1999, Neural Computation.
[8] Yoshua Bengio, et al. Gradient Flow in Recurrent Nets: the Difficulty of Learning Long-Term Dependencies, 2001.
[9] Nicolas Le Roux, et al. Topmoumoute Online Natural Gradient Algorithm, 2007, NIPS.
[10] James Martens, et al. Deep learning via Hessian-free optimization, 2010, ICML.
[11] Geoffrey E. Hinton, et al. Rectified Linear Units Improve Restricted Boltzmann Machines, 2010, ICML.
[12] Yoshua Bengio, et al. Deep Sparse Rectifier Neural Networks, 2011, AISTATS.
[13] Nicol N. Schraudolph, et al. Centering Neural Network Gradient Factors, 1996, Neural Networks: Tricks of the Trade.
[14] Daniel Povey, et al. Krylov Subspace Descent for Deep Learning, 2011, AISTATS.
[15] Klaus-Robert Müller, et al. Efficient BackProp, 2012, Neural Networks: Tricks of the Trade.
[16] Tapani Raiko, et al. Deep Learning Made Easier by Linear Transformations in Perceptrons, 2012, AISTATS.
[17] Andrew L. Maas. Rectifier Nonlinearities Improve Neural Network Acoustic Models, 2013.
[18] Yoshua Bengio, et al. Maxout Networks, 2013, ICML.
[19] Yann Ollivier, et al. Riemannian metrics for neural networks I: feedforward networks, 2013, arXiv:1303.0818.
[20] Yangqing Jia, et al. Learning Semantic Image Representations at a Large Scale, 2014.
[21] Benjamin Graham, et al. Fractional Max-Pooling, 2014, arXiv.
[22] Razvan Pascanu, et al. Revisiting Natural Gradient for Deep Networks, 2013, ICLR.
[23] Thomas Brox, et al. Striving for Simplicity: The All Convolutional Net, 2014, ICLR.
[24] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
[25] Ruslan Salakhutdinov, et al. Scaling up Natural Gradient by Sparsely Factorizing the Inverse Fisher Matrix, 2015, ICML.
[26] Sepp Hochreiter, et al. Toxicity Prediction using Deep Learning, 2015, arXiv.
[27] Tianqi Chen, et al. Empirical Evaluation of Rectified Activations in Convolutional Network, 2015, arXiv.
[28] Jian Sun, et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, 2015 IEEE International Conference on Computer Vision (ICCV).
[29] S. Hochreiter, et al. DeepTox: Toxicity prediction using deep learning, 2017.