Affiliation:
1. Department of Industrial Security Governance, Inha University, Michuhol-gu, Incheon 22212, Republic of Korea
Abstract
This study investigates the efficacy of advanced large language models, specifically GPT-4o, Claude-3.5 Sonnet, and GPT-3.5 Turbo, in detecting software vulnerabilities. Our experiment utilized vulnerable and secure code samples from the NIST Software Assurance Reference Dataset (SARD), focusing on C++, Java, and Python. We employed three distinct prompting techniques as follows: Concise, Tip Setting, and Step-by-Step. The results demonstrate that GPT-4o and Claude-3.5 Sonnet significantly outperform GPT-3.5 Turbo in vulnerability detection. GPT-4o showed the highest improvement with the Step-by-Step prompt, achieving an F1 score of 0.9072. Claude-3.5 Sonnet exhibited consistent high performance across all prompt types, with its Step-by-Step prompt yielding the best overall results (F1 score: 0.8933, AUC: 0.74). In contrast, GPT-3.5 Turbo showed minimal performance changes across prompts, with the Tip Setting prompt performing best (AUC: 0.65, F1 score: 0.6772), yet significantly lower than the other models. Our findings highlight the potential of advanced models in enhancing software security and underscore the importance of prompt engineering in optimizing their performance.
Funder
The Ministry of Education of the Republic of Korea and the National Research Foundation of Korea
Reference31 articles.
1. (2024, June 05). The Fourth Industrial Revolution. Available online: https://www.sogeti.com/globalassets/global/special/sogeti-things3en.pdf.
2. Lee, M., Yun, J.J., Pyka, A., Won, D., Kodama, F., Schiuma, G., Park, H., Jeon, J., Park, K., and Jung, K. (2018). How to Respond to the Fourth Industrial Revolution or the Second Information Technology Revolution? Dynamic New Combinations between Technology, Market, and Society through Open Innovation. J. Open Innov. Technol. Mark. Complex, 4.
3. Challenges and solutions of information security issues in the age of big data;Yang;China Commun.,2016
4. Abu Al-Haija, Q. (2022). Top-down machine learning-based architecture for cyberattack identification and classification in IoT communication networks. Front. Big. Data, 4.
5. Aslan, Ö., Aktuğ, S.S., Ozkan-Okay, M., Yilmaz, A.A., and Akin, E. (2023). A Comprehensive Review of Cyber Security Vulnerabilities, Threats, Attacks, and Solutions. Electronics, 12.