An empirical study of ensemble techniques for software fault prediction


Santosh S. Rathore¹ · Sandeep Kumar²

Accepted: 9 September 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract Many researchers have previously analyzed various techniques for software fault prediction (SFP). Yet the majority of these studies have shown limited prediction capability, and their performance was not consistent across software fault datasets. In contrast, ensemble-based SFP models have recently shown promising and improved results across different software fault datasets. However, many new and improved ensemble techniques have been introduced that have not yet been explored for SFP. Motivated by this, the paper investigates ensemble techniques for SFP. We empirically assess the performance of seven ensemble techniques, namely Dagging, Decorate, Grading, MultiBoostAB, RealAdaBoost, Rotation Forest, and Ensemble Selection. We believe that most of these ensemble techniques have not been used before for SFP. We conduct a series of experiments on benchmark fault datasets and use three distinct classification algorithms, namely naive Bayes, logistic regression, and J48 (decision tree), as base learners for the ensemble techniques. Experimental analysis revealed that Rotation Forest with J48 as the base learner achieved the highest precision, recall, and G-mean 1 values of 0.995, 0.994, and 0.994, respectively, and Decorate achieved the highest AUC value of 0.986. Further, the results of statistical tests showed statistically significant differences in performance among the ensemble techniques used for SFP. Additionally, the cost-benefit analysis showed that SFP models based on these ensemble techniques might help save software testing cost and effort for twenty of the twenty-eight fault datasets used.

Keywords Software fault prediction · Ensemble techniques · PROMISE data repository · Empirical analysis
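To make the reported measures concrete, the following minimal Python sketch computes precision, recall, and G-mean 1 from confusion-matrix counts. It assumes G-mean 1 is the geometric mean of precision and recall, a common definition that this excerpt does not state; the function name and the example counts are hypothetical and are not taken from the study.

```python
import math


def fault_prediction_metrics(tp, fp, fn):
    """Compute precision, recall, and G-mean 1 for the faulty class.

    tp: faulty modules correctly predicted as faulty
    fp: non-faulty modules wrongly predicted as faulty
    fn: faulty modules wrongly predicted as non-faulty

    Assumption: G-mean 1 = sqrt(precision * recall).
    """
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    g_mean1 = math.sqrt(precision * recall)
    return {"precision": precision, "recall": recall, "g_mean1": g_mean1}


# Hypothetical counts, for illustration only.
metrics = fault_prediction_metrics(tp=40, fp=10, fn=10)
print(metrics)
```

Under this definition, G-mean 1 is high only when precision and recall are both high, which is why it is often reported for imbalanced fault datasets.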

¹ Department of Information Technology, ABV-Indian Institute of Information Technology and Management, Gwalior, India

² Department of Computer Science and Engineering, Indian Institute of Technology Roorkee, Roorkee, India

1 Introduction Current software systems are growing rapidly in complexity and size; thus, ensuring their reliability and quality, which depend on software faults, is of paramount importance [1]. Software fault prediction (SFP) helps in the detection of faults by highlighting potentially faulty areas of code in the software system [2]. Identifying the areas of code that are more fault-prone can help the testing team allocate software quality assurance resources optimally and efficiently [3, 4]. SFP modeling has been examined widely by several researchers due to its inherent advantages in optimizing the utilization of testing resources and improving the quality of software projects [5–7]. For the last two
