Predicting parallel application performance via machine learning approaches

Steadily growing architectural complexity and machine scale make it increasingly challenging to create accurate performance models for large-scale applications. Traditional analytic models are difficult and time-consuming to construct, and they often fail to capture the full complexity of the system and application. To address these challenges, we automatically build models from execution samples. We use multilayer neural networks because they can represent arbitrary functions and handle noisy inputs robustly. In this paper, we focus on two well-known parallel applications whose variations in execution time are not well understood: SMG 2000, a semicoarsening multigrid solver, and HPL, an open-source implementation of LINPACK. We sparsely sample performance data on two radically different platforms across large, multidimensional parameter spaces and show that models built from these data predict performance within 2% to 7% of actual application runtimes. Copyright © 2007 John Wiley & Sons, Ltd.
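The general idea of fitting a multilayer neural network to sparse execution samples can be illustrated with a minimal sketch. Everything specific below is an assumption made for illustration and is not taken from the paper: the two normalized input parameters `n` (problem size) and `p` (process count), the synthetic `synthetic_runtime` function that stands in for measured runtimes, the single hidden layer of 8 tanh units, and the plain stochastic gradient descent training loop. The sketch trains on 40 random samples and reports mean percentage error on 20 held-out points.

```python
import math
import random


def synthetic_runtime(n, p):
    """Hypothetical stand-in for a measured runtime (not SMG 2000 or HPL data)."""
    return 0.5 + 0.5 * n * n / (0.5 + p)


def pct_error(predicted, actual):
    """Absolute prediction error as a percentage of the actual value."""
    return abs(predicted - actual) / actual * 100.0


class TinyMLP:
    """One hidden tanh layer and a linear output, trained by plain SGD."""

    def __init__(self, n_in, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _hidden(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.w1, self.b1)]

    def predict(self, x):
        h = self._hidden(x)
        return sum(w * hj for w, hj in zip(self.w2, h)) + self.b2

    def train_step(self, x, y, lr):
        """One gradient step on the squared error for a single sample."""
        h = self._hidden(x)
        err = sum(w * hj for w, hj in zip(self.w2, h)) + self.b2 - y
        for j, hj in enumerate(h):
            # Backpropagate through tanh using the pre-update output weight.
            grad_h = err * self.w2[j] * (1.0 - hj * hj)
            self.w2[j] -= lr * err * hj
            self.b1[j] -= lr * grad_h
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * grad_h * xi
        self.b2 -= lr * err


def mean_pct_error(model, data):
    return sum(pct_error(model.predict(x), y) for x, y in data) / len(data)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Sparse random samples of the (n, p) space, both normalized to [0, 1].
points = [(rng.random(), rng.random()) for _ in range(60)]
data = [((n, p), synthetic_runtime(n, p)) for n, p in points]
train, held_out = data[:40], data[40:]

model = TinyMLP(n_in=2, n_hidden=8, rng=rng)
error_before = mean_pct_error(model, held_out)

for _ in range(500):  # assumed epoch count and learning rate
    rng.shuffle(train)
    for x, y in train:
        model.train_step(x, y, lr=0.05)

error_after = mean_pct_error(model, held_out)
print(f"held-out mean % error: {error_before:.1f} -> {error_after:.1f}")
```

Evaluating on points that were not used for training mirrors the way prediction accuracy is reported relative to actual runtimes. A real study would replace `synthetic_runtime` with measured execution times and would likely need input and target scaling suited to the application's parameters.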
