Quantifying Point-Prediction Uncertainty in Neural Networks via Residual Estimation with an I/O Kernel

Neural networks (NNs) have been used extensively for a wide spectrum of real-world regression tasks, where the goal is to predict a numerical outcome such as revenue, effectiveness, or a quantitative result. In many such tasks, the point prediction alone is not enough: the uncertainty (i.e., the risk or confidence) of that prediction must also be estimated. Standard NNs, which are most often used in such tasks, do not provide uncertainty information. Existing approaches address this issue by combining Bayesian models with NNs, but the resulting models are harder to implement, more expensive to train, and usually do not predict as accurately as standard NNs. In this paper, a new framework, RIO (Residual estimation with an I/O kernel), is developed that makes it possible to estimate uncertainty in any pretrained standard NN. The behavior of the NN is captured by modeling its prediction residuals with a Gaussian Process, whose kernel includes both the NN's input and its output. The framework is evaluated on twelve real-world datasets, where it is found to (1) provide reliable estimates of uncertainty, (2) reduce the error of the point predictions, and (3) scale well to large datasets. Since RIO can be applied to any standard NN without modifications to the model architecture or training pipeline, it provides an important ingredient for building real-world NN applications.
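To make the core computation concrete, the sketch below shows a minimal NumPy rendering of the idea, under stated assumptions: a GP is fit to the residuals of a pretrained NN using a kernel that sums a squared-exponential component on the raw inputs and one on the NN's outputs, and its posterior is then used to correct the point prediction and report a standard deviation. The helper names (rbf, io_kernel, rio_fit_predict, nn_predict) and the fixed hyperparameters (ls_in, ls_out, var, noise) are illustrative, not the paper's interface; the full framework instead tunes kernel hyperparameters by maximizing the marginal likelihood and uses sparse GP approximations to scale to large datasets.

```python
# Minimal sketch of residual estimation with an I/O kernel (RIO-style).
# Hyperparameters are fixed here for clarity; names are illustrative.
import numpy as np

def rbf(A, B, lengthscale, variance):
    # Squared-exponential kernel between the rows of A and B.
    d2 = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def io_kernel(X1, Y1, X2, Y2, ls_in=1.0, ls_out=1.0, var=1.0):
    # I/O kernel: sum of an input-space kernel and an NN-output-space kernel.
    return rbf(X1, X2, ls_in, var) + rbf(Y1, Y2, ls_out, var)

def rio_fit_predict(X_train, y_train, nn_predict, X_test, noise=1e-2):
    # Residuals of the pretrained NN on the training set.
    yhat_train = nn_predict(X_train).reshape(-1, 1)
    yhat_test = nn_predict(X_test).reshape(-1, 1)
    r = y_train.ravel() - yhat_train.ravel()

    # Standard GP regression on the residuals, using the I/O kernel.
    K = io_kernel(X_train, yhat_train, X_train, yhat_train) + noise * np.eye(len(X_train))
    Ks = io_kernel(X_test, yhat_test, X_train, yhat_train)
    Kss_diag = np.diag(io_kernel(X_test, yhat_test, X_test, yhat_test))

    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, r))  # K^{-1} r via Cholesky
    mean_residual = Ks @ alpha                           # GP posterior mean of residual
    v = np.linalg.solve(L, Ks.T)
    var_pred = Kss_diag - np.sum(v**2, axis=0) + noise   # GP predictive variance

    # Corrected point prediction plus an uncertainty estimate.
    return yhat_test.ravel() + mean_residual, np.sqrt(np.maximum(var_pred, 0.0))
```

Because the GP models residuals rather than the targets themselves, the pretrained NN is left untouched: dropping the GP simply recovers the original point predictions, while keeping it yields both a corrected prediction and a calibrated spread around it.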
