Optimization of Deep Learning Precipitation Models Using Categorical Binary Metrics

This work introduces a methodology for optimizing neural network models using a combination of continuous and categorical binary indices in the context of precipitation forecasting. Probability of detection and false alarm rate are popular metrics used in the verification of precipitation models. However, machine learning models trained using gradient descent cannot be optimized directly on these metrics, as they are not differentiable. We propose an alternative formulation of these categorical indices that is differentiable, and we demonstrate how it can be used to optimize the skill of precipitation neural network models by framing training as a multi-objective optimization problem. To our knowledge, this is the first proposal of a methodology for optimizing weather neural network models based on categorical indices.

Plain Language Summary

Deep neural networks have recently demonstrated great versatility and an unprecedented capacity to model complex problems. In weather modeling, these algorithms have been applied to a range of problems. This is a promising area of research, given the availability of large volumes of weather data and increasingly powerful computers. Neural network models learn to solve a problem by optimizing a chosen metric. However, the quality of weather models is measured using a large variety of metrics, which makes it difficult to choose the metric the model should optimize.

In the case of precipitation, categorical binary metrics are a popular choice to assess the quality of a model. These metrics reduce precipitation to a 'yes' or 'no' event, so that the results of the predicting model can be compared with the actual observations. This method is simple yet powerful, and a large number of indices and statistics have been developed to assess different aspects of the quality of precipitation models.

As precipitation models are commonly assessed using these categorical binary metrics, it would be very convenient to optimize models based on them. Unfortunately, because these metrics rely on hard 'yes' or 'no' thresholds, they are not differentiable and are therefore unsuitable for optimizing deep learning models with gradient descent. In this work, we present an alternative formulation of these categorical binary indices that can be used to train models. We demonstrate how a deep learning model can be trained with it to generate better quality precipitation data.
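
To make the differentiability issue concrete, the following minimal sketch contrasts hard categorical scores with one possible smooth relaxation. It is an illustration under stated assumptions, not necessarily the formulation proposed in this work: the sigmoid relaxation, the `sharpness` parameter, the function names, the weighted-sum scalarization and its weights, and the sample values are all hypothetical. False alarm rate is taken here as false alarms divided by the sum of false alarms and correct negatives.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hard_pod_far(forecast, observed, threshold):
    """Categorical scores from hard yes/no events (piecewise constant).

    Assumes both denominators are non-zero for the given sample.
    """
    hits = misses = false_alarms = correct_negatives = 0
    for f, o in zip(forecast, observed):
        f_yes, o_yes = f >= threshold, o >= threshold
        if f_yes and o_yes:
            hits += 1
        elif o_yes:
            misses += 1
        elif f_yes:
            false_alarms += 1
        else:
            correct_negatives += 1
    pod = hits / (hits + misses)
    far = false_alarms / (false_alarms + correct_negatives)
    return pod, far


def soft_pod_far(forecast, observed, threshold, sharpness=5.0):
    """Assumed relaxation: the forecast 'yes' event becomes a sigmoid
    membership in (0, 1), so the contingency counts vary smoothly with
    the forecast values. Observations are fixed data and stay hard."""
    hits = misses = false_alarms = correct_negatives = 0.0
    for f, o in zip(forecast, observed):
        f_yes = sigmoid(sharpness * (f - threshold))
        o_yes = 1.0 if o >= threshold else 0.0
        hits += f_yes * o_yes
        misses += (1.0 - f_yes) * o_yes
        false_alarms += f_yes * (1.0 - o_yes)
        correct_negatives += (1.0 - f_yes) * (1.0 - o_yes)
    pod = hits / (hits + misses)
    far = false_alarms / (false_alarms + correct_negatives)
    return pod, far


def combined_loss(forecast, observed, threshold, w_pod=1.0, w_far=1.0):
    """Hypothetical weighted-sum scalarization of a multi-objective loss:
    a continuous term (MSE) plus soft categorical terms."""
    mse = sum((f - o) ** 2 for f, o in zip(forecast, observed)) / len(observed)
    pod, far = soft_pod_far(forecast, observed, threshold)
    return mse + w_pod * (1.0 - pod) + w_far * far


# Illustrative precipitation values (mm), not real data.
observed = [0.0, 0.2, 1.5, 3.0]
forecast = [0.1, 1.2, 0.8, 2.5]
nudged = [0.1, 1.2, 0.9, 2.5]  # third forecast moved slightly toward the event

print(hard_pod_far(forecast, observed, 1.0))  # (0.5, 0.5)
print(hard_pod_far(nudged, observed, 1.0))    # unchanged: no threshold crossed
print(soft_pod_far(forecast, observed, 1.0))
print(soft_pod_far(nudged, observed, 1.0))    # soft POD responds to the nudge
```

The hard scores do not change when a forecast value moves without crossing the threshold, so they provide no gradient signal, whereas the relaxed scores change continuously. In a deep learning framework, the same arithmetic written with tensor operations would let automatic differentiation propagate gradients through the relaxed scores.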
