Uncertainty Sets for Image Classifiers using Conformal Prediction

Convolutional image classifiers can achieve high predictive accuracy, but quantifying their uncertainty remains an unresolved challenge, hindering their deployment in consequential settings. Existing uncertainty quantification techniques, such as Platt scaling, attempt to calibrate the network’s probability estimates, but they do not have formal guarantees. We present an algorithm that modifies any classifier to output a predictive set containing the true label with a user-specified probability, such as 90%. The algorithm is simple and fast like Platt scaling, but provides a formal finite-sample coverage guarantee for every model and dataset. Our method modifies an existing conformal prediction algorithm to give more stable predictive sets by regularizing the small scores of unlikely classes after Platt scaling. In experiments on both Imagenet and Imagenet-V2 with ResNet-152 and other classifiers, our scheme outperforms existing approaches, achieving coverage with sets that are often factors of 5 to 10 smaller than a stand-alone Platt scaling baseline.
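The abstract describes the procedure only at a high level, so the following is a minimal sketch rather than the authors' implementation. It assumes split conformal calibration on held-out data, a score equal to the cumulative probability mass down to a class (in descending order of probability), and a linear penalty on ranks beyond a cutoff as one possible way to regularize the small scores of unlikely classes. The function names and the parameters `lam` and `k_reg` are hypothetical, the input probabilities are assumed to be already Platt-scaled (or otherwise softmaxed), and any randomization step is omitted.

```python
import numpy as np


def _sorted_scores(probs, lam, k_reg):
    """Return class order (most to least likely) and the regularized score at each rank."""
    order = np.argsort(-probs, axis=1)
    sorted_probs = np.take_along_axis(probs, order, axis=1)
    n_classes = probs.shape[1]
    # Assumed regularizer: ranks beyond k_reg pay lam per extra rank.
    penalty = lam * np.maximum(np.arange(1, n_classes + 1) - k_reg, 0)
    return order, np.cumsum(sorted_probs, axis=1) + penalty


def calibrate(probs, labels, alpha=0.1, lam=0.01, k_reg=5):
    """Compute the score threshold tau from a held-out calibration set."""
    order, scores = _sorted_scores(probs, lam, k_reg)
    ranks = np.argmax(order == labels[:, None], axis=1)  # 0-based rank of true label
    true_scores = scores[np.arange(len(labels)), ranks]
    n = len(true_scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))  # finite-sample corrected quantile index
    if k > n:
        return np.inf  # too few calibration points: only the full set is valid
    return np.sort(true_scores)[k - 1]


def predict_sets(probs, tau, lam=0.01, k_reg=5):
    """Return, for each row, the classes whose regularized score is at most tau."""
    order, scores = _sorted_scores(probs, lam, k_reg)
    include = scores <= tau
    return [order[i, include[i]] for i in range(len(probs))]
```

Because the same score function is applied to calibration and test points, the true label falls in the set with probability at least `1 - alpha` when the two are exchangeable, provided `lam` and `k_reg` match between `calibrate` and `predict_sets`. In this simplified version a set can be empty when even the top class scores above `tau`.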
