Affiliation:
1. School of Information Engineering, Inner Mongolia University of Science and Technology, Baotou, China
2. China Nanhu Academy of Electronics and Information Technology, Jiaxing, China
Abstract
Although deep learning models achieve strong performance, they are still easily deceived by adversarial samples. Some methods for generating adversarial samples are time-consuming, which is problematic for adversarial training, and existing adversarial training methods adapt poorly to the dynamic nature of the model, so designing an efficient adversarial training method remains challenging. In this paper, we propose an adversarial training method built on two components: AGFAT, an improved adversarial sample generation method for adversarial training, and AGFAT-DAT, an improved dynamic adversarial training method. AGFAT uses a word-frequency-based approach to identify significant words, filters replacement candidates, and applies an efficient semantic constraint module, which together reduce the time needed to generate adversarial samples. AGFAT-DAT is a dynamic adversarial training approach that cyclically attacks the model after adversarial training and uses the newly generated adversarial samples for further adversarial training. We demonstrate that the proposed method can significantly reduce the generation time of adversarial samples, and that the adversarially trained model can also effectively defend against other types of word-level adversarial attacks.
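The two components described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how word frequency maps to significance, how candidates are filtered, or how the semantic constraint is computed. The sketch assumes that rarer words are more significant, uses a hypothetical hand-written `synonyms` table as a stand-in for both candidate filtering and the semantic constraint, and treats the model as a hypothetical `fit` function that returns a `predict` callable. All function and parameter names are illustrative.

```python
from collections import Counter


def word_frequencies(corpus):
    """Count how often each lowercased token appears in the corpus."""
    return Counter(tok for text in corpus for tok in text.lower().split())


def rank_words_by_rarity(text, freqs):
    """Return token positions ordered from rarest to most frequent.

    Assumption: rarer words are treated as more significant. The abstract
    only states that the ranking is word-frequency based.
    """
    tokens = text.lower().split()
    return sorted(range(len(tokens)), key=lambda i: (freqs[tokens[i]], i))


def generate_adversarial(text, predict, freqs, synonyms):
    """Try single-word substitutions, most significant word first.

    `synonyms` is a hypothetical table that stands in for candidate
    filtering and the semantic constraint. Returns the first perturbed
    text that changes the prediction, or None if no substitution does.
    """
    tokens = text.lower().split()
    original_label = predict(" ".join(tokens))
    for i in rank_words_by_rarity(text, freqs):
        for candidate in synonyms.get(tokens[i], []):
            trial = " ".join(tokens[:i] + [candidate] + tokens[i + 1:])
            if predict(trial) != original_label:
                return trial
    return None


def dynamic_adversarial_training(train_set, fit, freqs, synonyms, rounds=3):
    """Cyclically attack the current model and retrain on what it misses.

    `fit` is a hypothetical training function: it takes a list of
    (text, label) pairs and returns a `predict(text)` callable.
    """
    data = list(train_set)
    predict = fit(data)
    for _ in range(rounds):
        new_samples = []
        for text, label in train_set:
            adv = generate_adversarial(text, predict, freqs, synonyms)
            if adv is not None and (adv, label) not in data:
                new_samples.append((adv, label))
        if not new_samples:
            break  # the current model resists every attack we can generate
        data.extend(new_samples)
        predict = fit(data)  # retrain so the next attack targets the new model
    return predict, data
```

The loop attacks the most recently trained model in each round rather than the original one, which is the sense in which the training is dynamic: adversarial samples are regenerated as the model changes instead of being computed once up front.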