[1] R. Douglas, et al. Neuronal circuits of the neocortex, 2004, Annual Review of Neuroscience.
[2] Ohad Shamir, et al. Implicit Regularization in ReLU Networks with the Square Loss, 2020, COLT.
[3] Qianli Liao, et al. Theoretical issues in deep networks, 2020, Proceedings of the National Academy of Sciences.
[4] Tomaso Poggio, et al. Loss landscape: SGD has a better view, 2020.
[5] Amit Daniely, et al. The Implicit Bias of Depth: How Incremental Learning Drives Generalization, 2020, ICLR.
[6] Tim Salimans, et al. Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks, 2016, NIPS.
[7] Mikhail Belkin, et al. Evaluation of Neural Architectures Trained with Square Loss vs Cross-Entropy in Classification Tasks, 2020, ICLR.
[8] Tomaso Poggio, et al. Loss landscape: SGD can have a better view than GD, 2020.
[9] Sanjeev Arora, et al. Theoretical Analysis of Auto Rate-Tuning by Batch Normalization, 2018, ICLR.
[10] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
[11] David L. Donoho, et al. Prevalence of neural collapse during the terminal phase of deep learning training, 2020, Proceedings of the National Academy of Sciences.
[12] Tomaso Poggio, et al. Generalization in deep network classifiers trained with the square loss, 2020.
[13] Nathan Srebro, et al. Lexicographic and Depth-Sensitive Margins in Homogeneous and Non-Homogeneous Deep Models, 2019, ICML.
[14] Kaifeng Lyu, et al. Gradient Descent Maximizes the Margin of Homogeneous Neural Networks, 2019, ICLR.