Evolutionary Multi-objective Optimization for Contextual Adversarial Example Generation

Authors:

Zhou Shasha (1), Huang Mingyu (2), Sun Yanan (3), Li Ke (4)

Affiliations:

1. University of Electronic Science and Technology of China, Chengdu, China / University of Exeter, Exeter, United Kingdom

2. University of Electronic Science and Technology of China, Chengdu, China

3. Sichuan University, Chengdu, China

4. University of Exeter, Exeter, United Kingdom

Abstract

The emergence of the 'code naturalness' concept, which suggests that software code shares statistical properties with natural language, paves the way for deep neural networks (DNNs) in software engineering (SE). However, DNNs can be vulnerable to certain human-imperceptible variations in the input, known as adversarial examples (AEs), which can degrade model performance. Numerous attack strategies have been proposed to generate AEs in computer vision and natural language processing, but far fewer target the source code of programming languages in SE. One challenge stems from the constraints an AE must satisfy, including syntactic and semantic constraints and a minimal modification ratio. These constraints, however, are subjective and can conflict with the goal of fooling DNNs. This paper develops a multi-objective adversarial attack method (dubbed MOAA), which tailors NSGA-II, a powerful evolutionary multi-objective (EMO) algorithm, and integrates it with CodeT5 to generate high-quality AEs based on the contextual information of the original code snippet. Experiments on 5 source code tasks with 10 datasets covering 6 programming languages show that our approach can generate a diverse set of high-quality AEs with promising transferability. In addition, using our AEs, we provide, for the first time, insights into the internal behavior of pre-trained models.
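To illustrate why conflicting goals lead to a set of AEs rather than a single best one, the sketch below filters candidates by Pareto dominance, the relation on which NSGA-II's non-dominated sorting is built. It is not the MOAA implementation: the two objectives (the victim model's confidence in the correct label and the fraction of tokens changed, both minimized), the candidate values, and the function names are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

# Every objective is minimized. The two objectives used here are hypothetical:
# (victim model confidence in the correct label, fraction of tokens changed).
Objectives = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated_front(points: List[Objectives]) -> List[int]:
    """Return the indices of the points that no other point dominates."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Made-up objective values for five candidate adversarial examples.
candidates = [
    (0.90, 0.02),  # barely modified, model still confident
    (0.40, 0.10),
    (0.35, 0.30),  # heavily modified, model least confident
    (0.60, 0.12),  # dominated by (0.40, 0.10)
    (0.40, 0.25),  # dominated by (0.40, 0.10)
]

print(non_dominated_front(candidates))  # prints [0, 1, 2]
```

The three surviving candidates trade attack strength against modification size, so none is strictly better than the others. An EMO algorithm such as NSGA-II evolves a population toward this kind of trade-off front instead of collapsing the objectives into one weighted score.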

Publisher

Association for Computing Machinery (ACM)

