Software Defect Prediction for Healthcare Big Data: An Empirical Evaluation of Machine Learning Techniques

Published:2021-03-15 Issue: Volume:2021 Page:1-16
ISSN:2040-2309
Container-title:Journal of Healthcare Engineering
language:en
Short-container-title:Journal of Healthcare Engineering

Author:

Khan Bilal¹, Naseem Rashid², Shah Muhammad Arif², Wakil Karzan³, Khan Atif⁴, Uddin M. Irfan⁵, Mahmoud Marwan⁶

Affiliation:

1. Department of Computer Science, City University of Science and Information Technology, Peshawar 25000, Pakistan

2. Department of IT and Computer Science, Pak-Austria Fachhochschule Institute of Applied Sciences and Technology, Haripur, Pakistan

3. Research Center, Sulaimani Polytechnic University, Sulimani 46001, Kurdistan Region, Iraq

4. Department of Computer Science, Islamia College, Peshawar 2500, Pakistan

5. Institute of Computing Kohat University of Science and Technology, Kohat, Pakistan

6. Faculty of Applied Studies, King Abdulaziz University, Jeddah, Saudi Arabia

Abstract

Software defect prediction (SDP) in the early phases of the software development life cycle (SDLC) remains a critical task. SDP has been studied extensively over the last few decades because it helps assure the quality of software systems. Early prediction of defective artifacts helps the development team use the available resources efficiently and effectively to deliver high-quality software products within a limited time. Several researchers have previously developed defect prediction models using machine learning (ML) and statistical techniques. ML methods are considered an effective approach for identifying defective modules because they mine hidden patterns among software metrics (attributes). Several researchers have also applied ML techniques to healthcare datasets. This study applies different ML techniques to software defect prediction using seven widely used datasets. The ML techniques are the multilayer perceptron (MLP), support vector machine (SVM), decision tree (J48), radial basis function (RBF), random forest (RF), hidden Markov model (HMM), credal decision tree (CDT), K-nearest neighbor (KNN), average one dependency estimator (A1DE), and Naïve Bayes (NB). The performance of each technique is evaluated using several measures: relative absolute error (RAE), mean absolute error (MAE), root mean squared error (RMSE), root relative squared error (RRSE), recall, and accuracy. Overall, RF performs best, with 88.32% average accuracy and a rank value of 2.96; SVM is second, with 87.99% average accuracy and a rank value of 3.83; and CDT is third, with 87.88% average accuracy and a rank value of 3.62. These results can serve as a reference point for new research in the SDP domain, so that any claimed improvement in prediction by a new technique or model can be benchmarked and verified.
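The abstract names its evaluation measures without defining them. The sketch below shows one common way to compute them for a binary defect label (1 = defective, 0 = clean), assuming the usual textbook definitions in which RAE and RRSE are normalized by a baseline that always predicts the mean of the actual values. The function names and the sample data are hypothetical, and the paper's own tooling may compute or scale these measures differently (for example, as percentages).

```python
import math


def error_measures(actual, predicted):
    """Return MAE, RMSE, RAE and RRSE for paired numeric values.

    Assumption: RAE and RRSE use the mean of `actual` as the baseline
    predictor; this is the textbook definition, not necessarily the
    exact variant used in the paper.
    """
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")

    mean_actual = sum(actual) / n
    abs_err = sum(abs(p - a) for a, p in zip(actual, predicted))
    sq_err = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    base_abs = sum(abs(a - mean_actual) for a in actual)
    base_sq = sum((a - mean_actual) ** 2 for a in actual)
    if base_abs == 0:
        raise ValueError("actual values are constant; RAE/RRSE undefined")

    return {
        "MAE": abs_err / n,
        "RMSE": math.sqrt(sq_err / n),
        "RAE": abs_err / base_abs,
        "RRSE": math.sqrt(sq_err / base_sq),
    }


def accuracy_and_recall(actual, predicted, positive=1):
    """Return (accuracy, recall) for paired class labels."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(actual, predicted))
    correct = sum(1 for a, p in pairs if a == p)
    true_pos = sum(1 for a, p in pairs if a == positive and p == positive)
    false_neg = sum(1 for a, p in pairs if a == positive and p != positive)
    positives = true_pos + false_neg
    recall = true_pos / positives if positives else 0.0
    return correct / len(pairs), recall


# Hypothetical example: four modules, predicted defect probabilities.
labels = [1, 0, 0, 1]
probabilities = [0.9, 0.2, 0.4, 0.6]
measures = error_measures(labels, probabilities)

# Hypothetical hard predictions for the same four modules.
acc, rec = accuracy_and_recall(labels, [1, 0, 1, 0])
```

Lower values are better for the four error measures, while higher values are better for accuracy and recall, so any ranking that combines them has to account for the direction of each measure.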

Funder

Pak-Austria Fachhochschule: Institute of Applied Sciences and Technology

Publisher

Hindawi Limited

Subject

Health Informatics, Biomedical Engineering, Surgery, Biotechnology

Link

http://downloads.hindawi.com/journals/jhe/2021/8899263.pdf
