Abstract
Objective
We investigated whether human reviewers recognize (possibly wrong) security patches suggested by Automated Program Repair (APR) tools for real-world projects. We also investigated whether knowing that a patch was produced by an allegedly specialized tool changes the decision of human reviewers.
Method
We performed an experiment with $n = 72$
Master's students in Computer Science. In the first phase, using a balanced design, we presented human reviewers with a combination of patches generated by APR tools for different vulnerabilities and asked them to adopt or reject each proposed patch. In the second phase, we told participants that some of the proposed patches were generated by security-specialized tools (even though the tool was actually a 'normal' APR tool) and measured whether the human reviewers would change their decision to adopt or reject a patch.
Results
Wrong patches are easier to identify than correct patches, and correct patches are not confused with partially correct patches. Patches from security-specialized APR tools are adopted more often than patches suggested by generic APR tools, but there is not enough evidence to determine whether 'bogus' security claims are distinguishable from 'true' security claims. Finally, the number of switches to the patches suggested by the security-specialized tool is significantly higher after the security information is revealed, irrespective of patch correctness.
Limitations
The experiment was conducted in an academic setting and focused on a limited sample of popular APR tools and vulnerability types.
Funder
H2020 LEIT Information and Communication Technologies
HORIZON EUROPE Global Challenges and European Industrial Competitiveness
Nederlandse Organisatie voor Wetenschappelijk Onderzoek (NWO)
Publisher
Springer Science and Business Media LLC