A Comparative Study of Commit Representations for JIT Vulnerability Prediction
Published: 2024-01-11
Issue: 1
Volume: 13
Page: 22
ISSN: 2073-431X
Container-title: Computers
Language: en
Short-container-title: Computers
Author:
Aladics Tamás (1, 2), Hegedűs Péter (1), Ferenc Rudolf (1)
Affiliation:
1. Department of Software Engineering, University of Szeged, 6720 Szeged, Hungary
2. FrontEndART Ltd., 6720 Szeged, Hungary
Abstract
With the evolution of software systems, their size and complexity are rising rapidly. Identifying vulnerabilities as early as possible is crucial for ensuring high software quality and security. Just-in-time (JIT) vulnerability prediction, which aims to find vulnerabilities at the time of commit, has increasingly become a focus of attention. In our work, we present a comparative study to provide insights into the current state of JIT vulnerability prediction by examining three candidate models: CC2Vec, DeepJIT, and Code Change Tree. These unique approaches aptly represent the various techniques used in the field, allowing us to offer a thorough description of the current limitations and strengths of JIT vulnerability prediction. Our focus was on the predictive power of the models, their usability in terms of false positive (FP) rates, and the granularity of the source code analysis they are capable of handling. For training and evaluation, we used two recently published datasets containing vulnerability-inducing commits: ProjectKB and Defectors. Our results highlight the trade-offs between predictive accuracy and operational flexibility and also provide guidance on the use of ML-based automation for developers, especially considering false positive rates in commit-based vulnerability prediction. These findings can serve as crucial insights for future research and practical applications in software security.
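The abstract's emphasis on false positive (FP) rates can be made concrete with a small calculation. The following is a minimal sketch, not the authors' evaluation code: the helper names and the commit counts are hypothetical and chosen only to show how a seemingly modest FP rate can still mean that most flagged commits are clean when vulnerability-inducing commits are rare.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # vulnerable commits that were flagged
    fp: int  # clean commits that were flagged
    tn: int  # clean commits that were not flagged
    fn: int  # vulnerable commits that were missed


def confusion_counts(y_true, y_pred):
    """Count outcomes for binary labels (1 = vulnerability-inducing)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred and truth:
            tp += 1
        elif pred and not truth:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def false_positive_rate(c):
    """Share of clean commits that were wrongly flagged: FP / (FP + TN)."""
    negatives = c.fp + c.tn
    return c.fp / negatives if negatives else 0.0


def precision(c):
    """Share of flagged commits that are truly vulnerable: TP / (TP + FP)."""
    flagged = c.tp + c.fp
    return c.tp / flagged if flagged else 0.0


# Hypothetical data (an assumption, not from the paper):
# 100 commits, 5 of them vulnerability-inducing.
y_true = [1] * 5 + [0] * 95
# The imagined model flags 4 of the 5 vulnerable commits and 10 of the 95 clean ones.
y_pred = [1] * 4 + [0] * 1 + [1] * 10 + [0] * 85

counts = confusion_counts(y_true, y_pred)
print(f"FP rate:   {false_positive_rate(counts):.3f}")
print(f"Precision: {precision(counts):.3f}")
```

In this made-up scenario, roughly one in ten clean commits is flagged, yet fewer than a third of all flagged commits are actually vulnerable, which is the kind of usability cost a developer would face when triaging alerts.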
Funder
European Union; Ministry of Innovation and Technology of Hungary from the National Research, Development and Innovation Fund; EU-funded project Sec4AI4Sec