Author:
van Dinter Raymon, Catal Cagatay, Giray Görkem, Tekinerdogan Bedir
Abstract
Just-in-time defect prediction (JITDP) research increasingly focuses on program changes rather than complete program modules, in the context of the continuous integration and continuous testing paradigms. Traditional machine learning-based defect prediction models have been built since the early 2000s, and deep learning-based models have been designed and implemented more recently. While deep learning (DL) algorithms can provide state-of-the-art performance in many application domains, they should be carefully selected and designed for a software engineering problem. In this research, we evaluate the performance of traditional machine learning algorithms and data sampling techniques for JITDP problems and compare it with the performance of a DL-based prediction model. Experimental results demonstrate that DL algorithms leveraging sampling methods perform significantly worse than the decision tree-based ensemble method. The XGBoost-based model appears to be 116 times faster than the multilayer perceptron (MLP)-based prediction model. This study indicates that DL-based models are not always the optimal solution for software defect prediction; shallow, traditional machine learning can therefore be preferred for its better performance in terms of accuracy and time.
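The abstract mentions data sampling techniques but does not name the ones that were evaluated. As an illustration only, the sketch below shows random oversampling, one common way to rebalance a dataset in which defective changes are rare. The function name `random_oversample`, the feature tuples, and the labels are all hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until
    every class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows belonging to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical change-level features: (lines_added, files_touched).
changes = [(10, 1), (200, 7), (5, 1), (12, 2), (340, 9), (8, 1)]
# Hypothetical labels: 1 = defect-inducing change, 0 = clean change.
is_defective = [0, 1, 0, 0, 0, 0]

balanced_rows, balanced_labels = random_oversample(changes, is_defective)
print(Counter(balanced_labels))
```

In practice, resampling of this kind is applied to the training split only, so that the evaluation data keeps its original class distribution.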
Publisher
Springer Science and Business Media LLC
Subject
Safety, Risk, Reliability and Quality, Software
Cited by
5 articles.