Affiliation:
1. Department of CSE, National Institute of Technology Jamshedpur, India
2. Thapar Institute of Engineering & Technology, Patiala, Punjab, India
3. Indian Institute of Information Technology Sonepat, India
Abstract
Software fault prediction, which aims to find and fix probable flaws before they appear in real-world settings, is an essential component of software quality assurance. This article provides a thorough analysis of the use of feature ranking algorithms for successful software failure prediction. In order to choose and prioritise the software metrics or qualities most important to fault prediction models, feature ranking approaches are essential. The proposed focus on applying an ensemble feature ranking algorithm to a specific software fault dataset, addressing the challenge posed by the dataset’s high dimensionality. In this extensive study, we examined the effectiveness of multiple machine learning classifiers on six different software projects: jedit, ivy, prop, xerces, tomcat, and poi, utilising feature selection strategies. In order to evaluate classifier performance under two scenarios—one with the top 10 features and another with the top 15 features—our study sought to determine the most relevant features for each project. SVM consistently performed well across the six datasets, achieving noteworthy results like 98.74% accuracy on “jedit” (top 10 features) and 91.88% on “tomcat” (top 10 features). Random Forest achieving 89.20% accuracy on the top 15 features, on “ivy.” In contrast, NB repeatedly recording the lowest accuracy rates, such as 51.58% on “poi” and 50.45% on “xerces” (the top 15 features). These findings highlight SVM and RF as the top performers, whereas NB was consistently the least successful classifier. The findings suggest that the choice of feature ranking algorithm has a substantial impact on the fault prediction models’ predictive accuracy and effectiveness. When using various ranking systems, the research also analyses the trade-offs between computing complexity and forecast accuracy.