A Learning-to-Rank Algorithm for Constructing Defect Prediction Models

This paper applies the learning-to-rank approach to software defect prediction. Ranking software modules by defect-proneness is important for allocating testing resources efficiently. However, prediction models that are optimized to predict the number of defects explicitly often fail to predict the rankings derived from those defect numbers correctly. We show that model construction methods that include the ranking performance measure in the objective function perform better at predicting the defect-proneness rankings of multiple modules. We present experimental results in which our method is compared against three other methods from the literature on five publicly available data sets.
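
The gap between predicting defect counts and predicting rankings can be illustrated with a small sketch. The defect counts, the two sets of predictions, and the pairwise ranking measure below are all hypothetical and chosen only for illustration; they are not the data, the models, or the performance measure used in the paper.

```python
from itertools import combinations


def squared_error(actual, predicted):
    """Sum of squared differences between actual and predicted defect counts."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


def pairwise_ranking_accuracy(actual, predicted):
    """Fraction of module pairs with different actual defect counts
    that the predictions place in the correct relative order.
    (An illustrative measure, assumed here; not the paper's measure.)"""
    correct = 0
    total = 0
    for i, j in combinations(range(len(actual)), 2):
        if actual[i] == actual[j]:
            continue  # ties in the actual counts carry no ordering information
        total += 1
        if (actual[i] - actual[j]) * (predicted[i] - predicted[j]) > 0:
            correct += 1
    return correct / total


# Hypothetical defect counts for four modules.
actual = [0, 1, 2, 10]

# Model A: predictions close to the true counts, but the three
# low-defect modules are predicted in reverse order.
model_a = [1.2, 1.1, 1.0, 9.0]

# Model B: predictions far from the true counts, but in the right order.
model_b = [0.0, 3.0, 6.0, 20.0]

for name, predicted in (("A", model_a), ("B", model_b)):
    print(
        name,
        "squared error:", round(squared_error(actual, predicted), 2),
        "ranking accuracy:", pairwise_ranking_accuracy(actual, predicted),
    )
```

In this toy case model A has the much smaller squared error, yet model B orders every pair of modules correctly. An objective that scores the ranking directly would prefer B, which is the kind of behavior the abstract argues for.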
