Benchmarking: state-of-the-art and beyond

Benchmarking is an indispensable task when assessing the performance of a (new) optimization algorithm. While this may appear to be a mainly technical exercise, surprisingly many methodological problems arise when benchmarking algorithms. Over the past decade, there has been considerable effort towards improving the benchmarking methodology for (gradient-free) optimization. This effort started with continuous optimization problems and was later extended to multi-objective and mixed-integer problems.

In this tutorial, we will present and discuss these key methodological ideas, emphasizing the importance of quantitative performance measures, the use of problem instances, and the careful choice of a testbed so that results are not biased towards problems that are too easy. We will particularly review the advantages of presenting data as the empirical cumulative distribution of runtimes (ECDF), a tool that everyone assessing the performance of an algorithm should know.

We will then review how this methodology is implemented within the COCO software and show how COCO can and should be used to benchmark an algorithm and write a scientific paper. This tutorial is intended for young researchers entering the field who need benchmarking for their research, as well as for researchers who wish to get up to date with the latest developments in benchmarking methodology.
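To give a concrete feel for the runtime ECDF mentioned above, the following sketch plots the fraction of (problem, target) pairs solved as a function of the evaluation budget. The runtime values are made-up illustrative numbers, not real benchmarking data; numpy and matplotlib are assumed to be available.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical runtimes (in function evaluations) until each
# (problem, target) pair was solved; np.inf marks pairs the
# algorithm never solved within the experiment.
runtimes = np.array([120, 450, 800, 2300, 9000, 31000, np.inf, np.inf])

# Evaluate the ECDF on a log-spaced grid of budgets from 10 to 1e5.
budgets = np.logspace(1, 5, 200)
ecdf = [(runtimes <= b).mean() for b in budgets]

plt.semilogx(budgets, ecdf)
plt.xlabel("number of function evaluations")
plt.ylabel("fraction of (problem, target) pairs solved")
plt.title("empirical cumulative distribution of runtimes")
plt.show()
```

Reading off the curve at any budget gives the proportion of targets reached so far, which is why ECDFs allow algorithms to be compared over the whole range of budgets rather than at a single cutoff.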
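As a minimal sketch of how an algorithm is hooked into COCO, the loop below follows the pattern of COCO's Python example experiment. It assumes the cocoex and cocopp modules are installed, and uses scipy.optimize.fmin merely as a stand-in for the algorithm under test.

```python
import cocoex           # experimentation module of the COCO platform
import scipy.optimize   # stand-in optimizer; replace with your algorithm

suite = cocoex.Suite("bbob", "", "")  # single-objective BBOB test suite
observer = cocoex.Observer("bbob", "result_folder: my-algorithm")

for problem in suite:  # iterates over functions, dimensions, and instances
    problem.observe_with(observer)  # log evaluations for postprocessing
    scipy.optimize.fmin(problem, problem.initial_solution, disp=False)
    problem.free()

# Postprocess the logged data into tables and ECDF plots:
import cocopp
cocopp.main(observer.result_folder)
```

Note that the suite iterates over several instances of each function, which is precisely the instance-based evaluation discussed above: performance is aggregated over instances rather than over repeated runs on one fixed function.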