Software Defect Prediction: A Comparison Between Artificial Neural Network and Support Vector Machine


Abstract The software industry demands good-quality software projects that are delivered on time and within budget. Software defect prediction (SDP) has led to the application of machine learning algorithms for building defect classification models that use software metrics and defect proneness as the independent and dependent variables, respectively. This work performs an empirical comparison of two classification methods, support vector machine (SVM) and artificial neural network (ANN), both of which can handle the complex nonlinear relationships between software attributes and software defects. Seven data sets from the PROMISE repository are used, and the prediction models are assessed on accuracy, recall, and specificity. The results show that SVM is better than ANN in terms of recall, while the latter performed well on accuracy and specificity. It is therefore concluded that the evaluation parameters should be chosen according to the criticality of the project before deciding which classification model to apply.
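The three evaluation parameters named in the abstract can all be derived from a binary confusion matrix. The following is a minimal sketch, not the authors' implementation: it assumes that defect-prone modules are labeled 1 and non-defect-prone modules 0, and the function name `classification_metrics` and the sample labels are hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, recall, and specificity for binary labels.

    Assumption (not from the paper): 1 = defect prone, 0 = non-defect prone.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # defects caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # clean modules passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # defects missed

    total = tp + tn + fp + fn
    return {
        # Share of all modules classified correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Share of defect-prone modules that were detected (sensitivity).
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        # Share of non-defect-prone modules that were correctly cleared.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical labels for eight modules, for illustration only.
    actual = [1, 1, 1, 0, 0, 0, 0, 0]
    predicted = [1, 1, 0, 0, 0, 0, 1, 0]
    for name, value in classification_metrics(actual, predicted).items():
        print(f"{name}: {value:.2f}")
```

On this illustrative data, one missed defect lowers recall while leaving specificity untouched, which is why a model can rank differently depending on which parameter is prioritized.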



Keywords Back propagation · Supervised learning · Software quality · Support vector machine · Artificial neural networks

I. Arora (corresponding author)
Northern India Engineering College, FC-26, Shastri Park, Delhi 110053, India
e-mail: [email protected]

A. Saha
University School of Information and Communication Technology, Guru Gobind Singh Indraprastha University, Sector-16C, Dwarka, Delhi 110078, India
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2018
R.K. Choudhary et al. (eds.), Advanced Computing and Communication Technologies, Advances in Intelligent Systems and Computing 562, https://doi.org/10.1007/978-981-10-4603-2_6


1 Introduction

Software defect prediction (SDP) assesses software quality by building defect prediction models from static software design and code metrics. SDP is usually treated as a binary classification problem wherein a module is either defect prone or non-defect prone. A module is a single indivisible unit of the source code, which may be a class or a procedure, described by a set of object-oriented metrics such as the Chidamber and Kemerer metrics [1] or procedural metrics such as the Halstead metrics [2], respectively. A software defect prediction model is built through a training phase using labeled historical defect data; the trained model then acts as a classifier for new, unseen data. The classification performance of the SDP model is examined along various parameters such as accuracy, sensitivity (recall), and specificity. Researchers and practitioners have successfully applied statistical techniques, such as logistic regression [3], and machine learning techniques, specifically supervised learning methods such as decision trees [4], Naïve Bayes [4], and support vector machines [5], to the SDP problem. The dependence of the statistical methods on the characteristics of the data set proved t