Semantic and traditional feature fusion for software defect prediction using hybrid deep learning model

Published:2024-07-01 Issue:1 Volume:14 Page:
ISSN:2045-2322
Container-title:Scientific Reports
language:en
Short-container-title:Sci Rep

Author:

Abdu Ahmed, Zhai Zhengjun, Abdo Hakim A., Algabri Redhwan, Al-masni Mohammed A., Muhammad Mannan Saeed, Gu Yeong Hyeon

Abstract

Software defect prediction aims to find a reliable method for predicting defects in a particular software project and assisting software engineers in allocating limited resources to release high-quality software products. While most earlier research has concentrated on employing traditional features, current methodologies are increasingly directed toward extracting semantic features from source code. Traditional features often fall short in identifying semantic differences within programs, differences that are essential for the development of reliable and effective prediction models. In contrast, semantic features cannot present statistical metrics about the source code, such as the code size and complexity. Thus, using only one kind of feature negatively affects prediction performance. To bridge the gap between the traditional and semantic features, we propose a novel defect prediction model that integrates traditional and semantic features using a hybrid deep learning approach to address this limitation. Specifically, our model employs a hybrid CNN-MLP classifier: the convolutional neural network (CNN) processes semantic features extracted from projects’ abstract syntax trees (ASTs) using Word2vec. In contrast, the traditional features extracted from the dataset repository are processed by a multilayer perceptron (MLP). Outputs of CNN and MLP are then integrated and fed into a fully connected layer for defect prediction. Extensive experiments are conducted on various open-source projects to validate CNN-MLP’s effectiveness. Experimental results indicate that CNN-MLP can significantly enhance defect prediction performance. Furthermore, CNN-MLP’s improvements outperform existing methods in non-effort-aware and effort-aware cases.
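
The two-branch design described in the abstract can be illustrated with a minimal forward-pass sketch in NumPy. It is not the authors' implementation. Every layer size, the ReLU activations, the global max pooling, the sigmoid output, and the names `HybridCnnMlp`, `cnn_branch`, `mlp_branch` and `predict_proba` are assumptions made for illustration. The embedding table is random and only stands in for Word2vec vectors learned from AST tokens. The weights are untrained.

```python
import numpy as np

rng = np.random.default_rng(0)


def relu(x):
    return np.maximum(x, 0.0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class HybridCnnMlp:
    """Forward pass only; all sizes below are illustrative assumptions."""

    def __init__(self, vocab_size=50, embed_dim=8, n_filters=4, kernel=3,
                 n_metrics=20, mlp_hidden=6):
        # Random stand-in for a Word2vec embedding table of AST tokens.
        self.embedding = rng.normal(0.0, 0.1, (vocab_size, embed_dim))
        # 1-D convolution filters over the token sequence.
        self.conv_w = rng.normal(0.0, 0.1, (n_filters, kernel, embed_dim))
        self.conv_b = np.zeros(n_filters)
        # Single hidden layer for the traditional (hand-crafted) metrics.
        self.mlp_w = rng.normal(0.0, 0.1, (n_metrics, mlp_hidden))
        self.mlp_b = np.zeros(mlp_hidden)
        # Fully connected output layer over the fused representation.
        self.out_w = rng.normal(0.0, 0.1, n_filters + mlp_hidden)
        self.out_b = 0.0

    def cnn_branch(self, token_ids):
        x = self.embedding[token_ids]                      # (seq_len, embed_dim)
        k = self.conv_w.shape[1]
        # Sliding windows of k consecutive token vectors.
        windows = np.stack([x[i:i + k] for i in range(len(x) - k + 1)])
        maps = relu(np.einsum("wke,fke->wf", windows, self.conv_w) + self.conv_b)
        return maps.max(axis=0)                            # global max pooling

    def mlp_branch(self, metrics):
        return relu(metrics @ self.mlp_w + self.mlp_b)

    def predict_proba(self, token_ids, metrics):
        # Fuse both branches by concatenation, then apply the output layer.
        fused = np.concatenate([self.cnn_branch(token_ids),
                                self.mlp_branch(metrics)])
        return float(sigmoid(fused @ self.out_w + self.out_b))


model = HybridCnnMlp()
token_ids = np.array([3, 17, 8, 42, 5, 11])  # hypothetical AST token ids
metrics = rng.random(20)                     # hypothetical metric vector
defect_probability = model.predict_proba(token_ids, metrics)
```

Concatenation is the simplest way to combine the two feature vectors before the final layer. The abstract says only that the outputs are "integrated", so the fusion operator used here is also an assumption.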

Funder

Sejong University

Publisher

Springer Science and Business Media LLC

Link

https://www.nature.com/articles/s41598-024-65639-4.pdf

Reference: 67 articles.

1. Jin, C. Cross-project software defect prediction based on domain adaptation learning and optimization. Expert Syst. Appl. 171, 114637 (2021).

2. Abdu, A. et al. Deep learning-based software defect prediction via semantic key features of source code-systematic survey. Mathematics 10, 3120 (2022).

3. Nassif, A. B. et al. Software defect prediction using learning to rank approach. Sci. Rep. 13, 18885 (2023).

4. Subramanyam, R. & Krishnan, M. S. Empirical analysis of CK metrics for object-oriented design complexity: Implications for software defects. IEEE Trans. Softw. Eng. 29, 297–310 (2003).

5. Moser, R., Pedrycz, W. & Succi, G. A comparative analysis of the efficiency of change metrics and static code attributes for defect prediction. In Proceedings of the 30th International Conference on Software Engineering 181–190 (2008).