GenMAT: A General-Purpose Machine Learning-Driven Auto-Tuner for Heterogeneous Platforms

As computing platforms evolve to include heterogeneous resources, developing optimized code that fully exploits their computing power becomes increasingly challenging. Domain experts need extensive knowledge of computer architecture, compiler optimizations, and parallel computing to understand which implementation will work best for their problem domain and data. Even after considerable time spent learning, writing, and debugging high-performance code, the resulting optimizations may not generalize to different inputs, applications, or computing platforms. To help end users deploy workloads on heterogeneous environments optimally and productively, a fundamental problem must be solved: automatically finding the best "variant" of an application, that is, the implementation with the optimal configuration on the most suitable hardware resource, yielding the minimum runtime. We propose GenMAT, a portable tool for identifying the best variant of any application specified as a meta-program with exposed tunable parameters, on any hardware. GenMAT automatically profiles the application by varying the exposed tunable parameters to generate a small set of profiling data. GenMAT then trains a compact machine learning model and uses it to quickly predict the runtimes of a large number of possible parameter settings and thereby identify the best variant. When determining the best linear algebra library for matrix operations, the variant selected by GenMAT has a runtime within 3.5% of that of the true best variant. When identifying the best Halide schedule, GenMAT correctly ranks the runtimes of thousands of candidates, with an average Spearman’s rank correlation coefficient of 0.95.
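The profile, train, predict, and select workflow described above can be illustrated with a minimal, self-contained sketch. Everything in it is an assumption made for illustration and is not GenMAT's implementation: `measure_runtime` is a hypothetical synthetic cost function standing in for actually running a variant, the two tunable parameters (`tile`, `threads`) are invented, and a k-nearest-neighbor regressor stands in for the compact machine learning model, whose actual form the abstract does not specify. The sketch also computes the two metrics the abstract reports: the relative runtime deviation of the selected variant from the true best, and Spearman's rank correlation between predicted and measured runtimes.

```python
import itertools
import math
import random


def measure_runtime(tile, threads):
    """Hypothetical stand-in for running and timing one variant (lower is better)."""
    return (tile - 48) ** 2 / 64.0 + 32.0 / threads + 0.5 * threads


def features(point):
    """Map a (tile, threads) setting to roughly comparable feature scales."""
    tile, threads = point
    return (tile / 8.0, math.log2(threads))


def knn_predict(samples, point, k=3):
    """Predict a runtime as the mean of the k nearest profiled settings.

    A k-NN regressor is used here only as a simple placeholder model.
    """
    fp = features(point)

    def dist(sample):
        return sum((a - b) ** 2 for a, b in zip(features(sample[0]), fp))

    nearest = sorted(samples, key=dist)[:k]
    return sum(rt for _, rt in nearest) / len(nearest)


def ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Full space of candidate variants (hypothetical tunable parameters).
candidates = list(itertools.product(range(8, 129, 8), [1, 2, 4, 8, 16]))

# Step 1: profile only a small random subset of the candidates.
rng = random.Random(0)
profiled = rng.sample(candidates, 20)
samples = [(p, measure_runtime(*p)) for p in profiled]

# Step 2: predict the runtime of every candidate from the profiled subset.
predicted = [knn_predict(samples, c) for c in candidates]

# Step 3: select the variant with the lowest predicted runtime.
best_idx = min(range(len(candidates)), key=lambda i: predicted[i])
chosen = candidates[best_idx]

# Evaluation: exhaustive measurement is affordable only in this toy setting.
actual = [measure_runtime(*c) for c in candidates]
deviation = (actual[best_idx] - min(actual)) / min(actual)
rho = spearman(predicted, actual)

print(f"chosen variant: {chosen}")
print(f"runtime deviation from true best: {deviation:.1%}")
print(f"Spearman rank correlation: {rho:.2f}")
```

In a real deployment the exhaustive `actual` measurements would not be available, which is the point of the approach: only the small profiled subset is ever run, and the model's predictions are used to rank the rest.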