Statistical debugging: simultaneous identification of multiple bugs

We describe a statistical approach to software debugging in the presence of multiple bugs. Because of sparse sampling and complex interactions among program predicates, many generic off-the-shelf algorithms fail to select useful bug predictors. Taking inspiration from bi-clustering algorithms, we propose an iterative collective voting scheme for the program runs and predicates. We demonstrate successful debugging results on several real-world programs and a large debugging benchmark suite.
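To make the idea of collective voting between runs and predicates concrete, here is a minimal sketch in Python. It is an illustrative simplification and not the algorithm from the paper: the function name `vote_predicates`, the quality formula, and the fixed iteration count are all assumptions made for this example. Each failed run splits one vote among the predicates that were true in it, in proportion to their current quality, and each predicate's quality is then recomputed from the votes it received, discounted by how often it was true in successful runs.

```python
def vote_predicates(truth, failed, iterations=20):
    """Hypothetical iterative voting sketch (not the paper's exact method).

    truth[i][j] is True if predicate j was observed true in run i.
    failed[i] is True if run i failed.
    Returns one quality score per predicate; higher suggests a better bug predictor.
    """
    n_runs = len(truth)
    n_preds = len(truth[0])

    # Count, for each predicate, the successful runs in which it was true.
    successes = [
        sum(1 for i in range(n_runs) if truth[i][j] and not failed[i])
        for j in range(n_preds)
    ]

    quality = [1.0] * n_preds  # start with every predicate equally trusted
    for _ in range(iterations):
        votes = [0.0] * n_preds
        for i in range(n_runs):
            if not failed[i]:
                continue
            # A failed run distributes a single vote among its true predicates.
            total = sum(quality[j] for j in range(n_preds) if truth[i][j])
            if total == 0:
                continue
            for j in range(n_preds):
                if truth[i][j]:
                    votes[j] += quality[j] / total
        # Assumed scoring rule: votes from failures, penalized by successes.
        quality = [votes[j] / (1.0 + successes[j]) for j in range(n_preds)]
    return quality


# Toy data: predicates 0 and 1 each track a different bug;
# predicate 2 is true in every run and so predicts nothing.
truth = [
    [True, False, True],   # fails because of bug A
    [True, False, True],   # fails because of bug A
    [False, True, True],   # fails because of bug B
    [False, True, True],   # fails because of bug B
    [False, False, True],  # succeeds
    [False, False, True],  # succeeds
]
failed = [True, True, True, True, False, False]
scores = vote_predicates(truth, failed)
```

On this toy input the two bug-specific predicates end up with equal scores that exceed the score of the always-true predicate, because the failed runs progressively shift their votes away from the predicate that also appears in successful runs. A real setting would also have to deal with sampled (partially observed) predicates, which this sketch ignores.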
