Optimizing loss functions through multivariate Taylor polynomial parameterization
[1] Geoffrey E. Hinton,et al. Learning internal representations by error propagation , 1986 .
[2] Yike Guo,et al. Semantic Image Synthesis via Adversarial Learning , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[3] Elliot Meyerson,et al. Evolving Deep Neural Networks , 2017, Artificial Intelligence in the Age of Neural Networks and Brain Computing.
[4] Yaochu Jin,et al. Surrogate-assisted evolutionary computation: Recent advances and future challenges , 2011, Swarm Evol. Comput..
[5] Jorge Nocedal,et al. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima , 2016, ICLR.
[6] Nikolaus Hansen,et al. Completely Derandomized Self-Adaptation in Evolution Strategies , 2001, Evolutionary Computation.
[7] B. L. Welch. The generalization of ‘Student's’ problem when several different population variances are involved , 1947 .
[8] Nikos Komodakis,et al. Wide Residual Networks , 2016, BMVC.
[9] Yann Dauphin,et al. Empirical Analysis of the Hessian of Over-Parametrized Neural Networks , 2017, ICLR.
[10] Graham W. Taylor,et al. Improved Regularization of Convolutional Neural Networks with Cutout , 2017, ArXiv.
[11] Wolfgang Banzhaf,et al. Genetic Programming: An Introduction , 1997 .
[12] Andrew Y. Ng,et al. Reading Digits in Natural Images with Unsupervised Feature Learning , 2011 .
[13] Nitish Srivastava,et al. Improving neural networks by preventing co-adaptation of feature detectors , 2012, ArXiv.
[14] John J. Grefenstette,et al. Genetic Search with Approximate Function Evaluation , 1985, ICGA.
[15] Alexandre Lacoste,et al. Quantifying the Carbon Emissions of Machine Learning , 2019, ArXiv.
[16] Hao Li,et al. Visualizing the Loss Landscape of Neural Nets , 2017, NeurIPS.
[17] Dumitru Erhan,et al. Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[18] Thomas Brox,et al. Striving for Simplicity: The All Convolutional Net , 2014, ICLR.
[19] A. Tikhonov,et al. Solution of Incorrectly Formulated Problems and the Regularization Method , 1963 .
[20] P. Graves-Morris. The numerical calculation of Padé approximants , 1979 .
[21] Kristen Grauman,et al. 2.5D Visual Sound , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[22] Jian Sun,et al. Identity Mappings in Deep Residual Networks , 2016, ECCV.
[23] Geoffrey E. Hinton,et al. Visualizing Data using t-SNE , 2008 .
[24] D. E. Roberts,et al. Calculation of Canterbury approximants , 1984 .
[25] Yan Pan,et al. Modelling Sentence Pairs with Tree-structured Attentive Encoder , 2016, COLING.
[26] Seong Joon Oh,et al. CutMix: Regularization Strategy to Train Strong Classifiers With Localizable Features , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[27] Risto Miikkulainen,et al. Faster Training by Selecting Samples Using Embeddings , 2019, 2019 International Joint Conference on Neural Networks (IJCNN).
[28] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[29] P. J. Huber. Robust Estimation of a Location Parameter , 1964 .
[30] Yuan Yu,et al. TensorFlow: A system for large-scale machine learning , 2016, OSDI.
[31] Frank Hutter,et al. Neural Architecture Search: A Survey , 2018, J. Mach. Learn. Res..
[32] Nikolaus Hansen,et al. Evaluating the CMA Evolution Strategy on Multimodal Test Functions , 2004, PPSN.
[33] Risto Miikkulainen,et al. Improved Training Speed, Accuracy, and Data Utilization Through Loss Function Optimization , 2019, 2020 IEEE Congress on Evolutionary Computation (CEC).
[34] Junmo Kim,et al. Deep Pyramidal Residual Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[35] G. Ruxton. The unequal variance t-test is an underused alternative to Student's t-test and the Mann–Whitney U test , 2006 .
[36] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.
[37] Bogdan Gabrys,et al. Metalearning: a survey of trends and technologies , 2013, Artificial Intelligence Review.
[38] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[39] Stefano Soatto,et al. Time Matters in Regularizing Deep Networks: Weight Decay and Data Augmentation Affect Early Learning Dynamics, Matter Little Near Convergence , 2019, NeurIPS.
[40] Nikolaus Hansen,et al. Adapting arbitrary normal mutation distributions in evolution strategies: the covariance matrix adaptation , 1996, Proceedings of IEEE International Conference on Evolutionary Computation.
[41] Risto Miikkulainen,et al. Evolutionary optimization of deep learning activation functions , 2020, GECCO.
[42] J. Chisholm. Rational approximants defined from double power series , 1973 .
[43] Chen Liang,et al. AutoML-Zero: Evolving Machine Learning Algorithms From Scratch , 2020, ICML.
[44] Leslie N. Smith,et al. Cyclical Learning Rates for Training Neural Networks , 2015, 2017 IEEE Winter Conference on Applications of Computer Vision (WACV).
[45] Alex Krizhevsky,et al. Learning Multiple Layers of Features from Tiny Images , 2009 .