Problems with SZZ and features: An empirical study of the state of practice of defect prediction data collection

Published:2022-01-18 Issue:2 Volume:27
ISSN:1382-3256
Container-title:Empirical Software Engineering
language:en
Short-container-title:Empir Software Eng

Author:

Herbold Steffen, Trautsch Alexander, Trautsch Fabian, Ledel Benjamin

Abstract

Context: The SZZ algorithm is the de facto standard for labeling bug fixing commits and finding inducing changes for defect prediction data. Recent research uncovered potential problems in different parts of the SZZ algorithm. Most defect prediction data sets provide only static code metrics as features, while research indicates that other features are also important.

Objective: We provide an empirical analysis of the defect labels created with the SZZ algorithm and the impact of commonly used features on results.

Method: We used a combination of manual validation and adopted or improved heuristics for the collection of defect data. We conducted an empirical study on 398 releases of 38 Apache projects.

Results: We found that only half of the bug fixing commits determined by SZZ are actually bug fixing. If a six-month time frame is used in combination with SZZ to determine which bugs affect a release, one file is incorrectly labeled as defective for every file that is correctly labeled as defective. In addition, two defective files are missed. We also explored the impact of the relatively small set of features that are available in most defect prediction data sets, as there are multiple publications that indicate that, e.g., churn related features are important for defect prediction. We found that the difference of using more features is not significant.

Conclusion: Problems with inaccurate defect labels are a severe threat to the validity of the state of the art of defect prediction. Small feature sets seem to be a less severe threat.
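The file-level ratios reported in the results can be restated as precision and recall of the defect labels. The sketch below is only an illustration of that arithmetic, not a computation taken from the paper: it assumes the statement means that for each correctly labeled defective file (true positive) there is one incorrectly labeled file (false positive) and two missed defective files (false negatives). The function name `label_quality` is hypothetical.

```python
from fractions import Fraction


def label_quality(tp: int, fp: int, fn: int) -> dict:
    """Precision, recall, and F1 of defect labels from raw counts."""
    precision = Fraction(tp, tp + fp)        # share of labeled files that are truly defective
    recall = Fraction(tp, tp + fn)           # share of truly defective files that were labeled
    f1 = Fraction(2 * tp, 2 * tp + fp + fn)  # harmonic mean of precision and recall
    return {"precision": precision, "recall": recall, "f1": f1}


# Assumed reading of the abstract: per 1 correct label, 1 wrong label and 2 misses.
ratios = label_quality(tp=1, fp=1, fn=2)
for name, value in ratios.items():
    print(f"{name}: {value} = {float(value):.2f}")
# precision: 1/2 = 0.50
# recall: 1/3 = 0.33
# f1: 2/5 = 0.40
```

Under this reading, only half of the files labeled defective are actually defective, and only a third of the defective files are found.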

Funder

Deutsche Forschungsgemeinschaft

Technische Universität Clausthal

Publisher

Springer Science and Business Media LLC

Subject

Software

Link

https://link.springer.com/content/pdf/10.1007/s10664-021-10092-4.pdf

Reference: 85 articles.

1. Altinger H, Siegl S, Dajsuren Y, Wotawa F (2015) A novel industry grade dataset for fault prediction based on model-driven developed automotive embedded software. In: Proceedings of the 12th Working Conference on Mining Software Repositories, IEEE Press, Piscataway, NJ, USA, MSR ’15, pp 494–497. http://dl.acm.org/citation.cfm?id=2820518.2820596

2. Antoniol G, Ayari K, Di Penta M, Khomh F, Guéhéneuc YG (2008) Is it a bug or an enhancement? A text-based approach to classify change requests. In: Proceedings of the 2008 Conference of the Center for Advanced Studies on Collaborative Research: Meeting of Minds, Association for Computing Machinery, New York, NY, USA, CASCON ’08. https://doi.org/10.1145/1463788.1463819

3. Bird C, Bachmann A, Aune E, Duffy J, Bernstein A, Filkov V, Devanbu P (2009a) Fair and balanced?: Bias in bug-fix datasets. In: Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering, ACM, New York, NY, USA, ESEC/FSE ’09, pp 121–130. https://doi.org/10.1145/1595696.1595716

4. Bird C, Rigby PC, Barr ET, Hamilton DJ, German DM, Devanbu P (2009b) The promises and perils of mining git. In: 2009 6th IEEE International Working Conference on Mining Software Repositories, pp 1–10. https://doi.org/10.1109/MSR.2009.5069475

5. Bird C, Bachmann A, Rahman F, Bernstein A (2010) Linkster: Enabling efficient manual inspection and annotation of mined data. In: Proceedings of the Eighteenth ACM SIGSOFT International Symposium on Foundations of Software Engineering, ACM, New York, NY, USA, FSE ’10, pp 369–370. https://doi.org/10.1145/1882291.1882352

Cited by 34 articles.

1. Towards a framework for reliable performance evaluation in defect prediction;Science of Computer Programming;2024-12

2. Unveiling the impact of unchanged modules across versions on the evaluation of within‐project defect prediction models;Journal of Software: Evolution and Process;2024-08-02

3. On Refining the SZZ Algorithm with Bug Discussion Data;Empirical Software Engineering;2024-07-24

4. MineCPP: Mining Bug Fix Pairs and Their Structures;Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering;2024-07-10

5. Cleaning Up Confounding: Accounting for Endogeneity Using Instrumental Variables and Two-Stage Models;ACM Transactions on Software Engineering and Methodology;2024-06-28