Ziheng Jiang | Michael C. Mozer | Kunal Talwar | Chiyuan Zhang
[1] Xu Sun, et al. Adaptive Gradient Methods with Dynamic Bound of Learning Rate, 2019, ICLR.
[2] Mikhail Belkin, et al. To understand deep learning we need to understand kernel learning, 2018, ICML.
[3] Vitaly Feldman, et al. Does learning require memorization? A short tale about a long tail, 2019, STOC.
[4] Jason Weston, et al. Curriculum learning, 2009, ICML '09.
[5] Yoshua Bengio, et al. An Empirical Study of Example Forgetting during Deep Neural Network Learning, 2018, ICLR.
[6] Mikhail Belkin, et al. Overfitting or perfect fitting? Risk bounds for classification and regression rules that interpolate, 2018, NeurIPS.
[7] Tengyuan Liang, et al. Just Interpolate: Kernel "Ridgeless" Regression Can Generalize, 2018, The Annals of Statistics.
[8] Arthur Jacot, et al. Neural Tangent Kernel: Convergence and Generalization in Neural Networks, 2018, NeurIPS.
[9] Vinay Uday Prabhu, et al. Do deep neural networks learn shallow learnable examples first?, 2019.
[10] Bernd Bischl, et al. Robust Anomaly Detection in Images using Adversarial Autoencoders, 2019, ECML/PKDD.
[11] Nathan Srebro, et al. The Marginal Value of Adaptive Gradient Methods in Machine Learning, 2017, NIPS.
[12] Nicholas Carlini, et al. Prototypical Examples in Deep Learning: Metrics, Characteristics, and Utility, 2018.
[13] Arthur Zimek, et al. On the evaluation of unsupervised outlier detection: measures, datasets, and an empirical study, 2016, Data Mining and Knowledge Discovery.
[14] Sridhar Ramaswamy, et al. Efficient algorithms for mining outliers from large data sets, 2000, SIGMOD '00.
[15] Quoc V. Le, et al. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks, 2019, ICML.
[16] Mikhail Belkin, et al. Two models of double descent for weak features, 2019, SIAM J. Math. Data Sci.
[17] Hans-Peter Kriegel, et al. LOF: identifying density-based local outliers, 2000, SIGMOD '00.
[18] Richard Socher, et al. Improving Generalization Performance by Switching from Adam to SGD, 2017, arXiv.