Attention-guided Adversarial Attack for Video Object Segmentation

Published:2023-11-14 Issue:6 Volume:14 Page:1-22
ISSN:2157-6904
Container-title:ACM Transactions on Intelligent Systems and Technology
language:en
Short-container-title:ACM Trans. Intell. Syst. Technol.

Author:

Yao Rui¹, Chen Ying¹, Zhou Yong¹, Hu Fuyuan², Zhao Jiaqi³, Liu Bing³, Shao Zhiwen³

Affiliation:

1. School of Computer Science and Technology, China University of Mining and Technology, Engineering Research Center of Mine Digitization, Ministry of Education of the People's Republic of China, China

2. School of Electronic and Information Engineering, Suzhou University of Science and Technology, China

3. School of Computer Science and Technology, China University of Mining and Technology, China

Abstract

Video Object Segmentation (VOS) methods have made many breakthroughs thanks to the continuous advancement of deep learning. However, deep learning models are vulnerable to malicious adversarial attacks, which mislead a model into making wrong decisions by adding adversarial perturbations that humans cannot perceive to the input image. This threat implies that video object segmentation methods are also vulnerable to attacks, which compromises their security. Therefore, we study adversarial attacks on the VOS task to better identify the vulnerabilities of VOS methods, which in turn provides an opportunity to improve their robustness. In this paper, we propose an attention-guided adversarial attack method. It uses spatial attention blocks to capture features with global dependencies and construct correlations between consecutive video frames, and it performs multipath aggregation to integrate spatial-temporal perturbations effectively, thereby guiding the deconvolution network to generate adversarial examples with strong attack capability. Specifically, a class loss function, based on the enhanced feature map of the object class, is designed to enable the deconvolution network to better activate noise in other regions and suppress the activation related to the object class. In addition, an attentional feature loss is designed to enhance the transferability of the attack. Experimental results on the DAVIS dataset show that the proposed attention-guided adversarial attack method significantly reduces the segmentation accuracy of OSVOS, with the J & F mean on DAVIS 2016 reaching a drop rate of 73.6%. The generated adversarial examples are also highly transferable to other video object segmentation models.
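The abstract reports attack strength as a drop rate in the J & F mean. As a minimal sketch of how such a number could be computed, the Python below implements region similarity J as the Jaccard index (intersection over union) of binary masks, which is the standard DAVIS definition, and a relative drop rate. The drop-rate formula, (clean - adversarial) / clean, is an assumption here, since the abstract does not define it. The contour accuracy F is omitted, and the masks, the helper names `jaccard` and `drop_rate`, and all numbers are toy illustrations, not results from the paper.

```python
from typing import Sequence

# A binary mask as rows of 0/1 values (toy representation for illustration).
Mask = Sequence[Sequence[int]]


def jaccard(pred: Mask, gt: Mask) -> float:
    """Region similarity J: intersection over union of two binary masks."""
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union


def drop_rate(clean_score: float, adv_score: float) -> float:
    """Relative drop in a score. ASSUMED form: (clean - adv) / clean."""
    if clean_score <= 0:
        raise ValueError("clean_score must be positive")
    return (clean_score - adv_score) / clean_score


# Toy 3x3 masks: ground truth, prediction on the clean frame,
# and prediction on the adversarially perturbed frame.
gt = [[0, 1, 1], [0, 1, 1], [0, 0, 0]]
clean_pred = [[0, 1, 1], [0, 1, 0], [0, 0, 0]]
adv_pred = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]

j_clean = jaccard(clean_pred, gt)  # 3 shared pixels / 4 in the union = 0.75
j_adv = jaccard(adv_pred, gt)      # 1 shared pixel / 5 in the union = 0.2
j_drop = drop_rate(j_clean, j_adv)

print(f"J clean={j_clean:.3f}  J adversarial={j_adv:.3f}  drop rate={j_drop:.1%}")
```

Under this assumed definition, a 73.6% drop rate would mean the attacked model keeps roughly a quarter of its clean J & F mean.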

Funder

National Natural Science Foundation of China

Natural Science Foundation of Jiangsu Province

Xuzhou Key Research and Development Program

Publisher

Association for Computing Machinery (ACM)

Subject

Artificial Intelligence, Theoretical Computer Science

Link

https://dl.acm.org/doi/pdf/10.1145/3617067

References: 70 articles.

1. One-Shot Video Object Segmentation

2. End-to-End Object Detection with Transformers

3. State-Aware Tracker for Real-Time Video Object Segmentation

4. Blazingly Fast Video Object Segmentation with Pixel-Wise Metric Learning

5. Fast and Accurate Online Video Object Segmentation via Tracking Parts

Cited by 1 article.

1. Multi-rotor UAV Detection Algorithm Based on the Improved Yolov5; 2024 IEEE 4th International Conference on Electronic Technology, Communication and Information (ICETCI); 2024-05-24