Backdozer: A Backdoor Detection Methodology for DRL-based Traffic Controllers

Published: 2024-08-09 Issue: 4 Volume: 1 Page: 1-22
ISSN: 2833-0528
Container-title: ACM Journal on Autonomous Transportation Systems
Language: en
Short-container-title: ACM J. Auton. Transport. Syst.

Authors:

Wang Yue¹, Li Wenqing², Alam Manaar², Maniatakos Michail², Jabari Saif Eddin²

Affiliation:

1. New York University Tandon School of Engineering, New York, United States

2. New York University Abu Dhabi, Abu Dhabi, United Arab Emirates

Abstract

While the advent of Deep Reinforcement Learning (DRL) has substantially improved the efficiency of Autonomous Vehicles (AVs), it makes them vulnerable to backdoor attacks that can potentially cause traffic congestion or even collisions. Backdoor functionality is typically implanted by poisoning training datasets with stealthy malicious data, designed to preserve high accuracy on legitimate inputs while inducing desired misclassifications for specific adversary-selected inputs. Existing countermeasures against backdoors predominantly concentrate on image classification, utilizing image-based properties, rendering these methods inapplicable to the regression tasks of DRL-based AV controllers that rely on continuous sensor data as inputs. In this article, we introduce the first-ever defense against backdoors on regression tasks of DRL-based models, called Backdozer. Our method systematically extracts more abstract features from representations of training data by projecting them into a specific latent subspace and segregating them into several disjoint groups based on the distribution of legitimate outputs. The key observation of Backdozer is that authentic representations for each group reside in one latent subspace, whereas the incorporation of malicious data impacts that subspace. Backdozer optimizes a sample-wise weight vector for the representations capturing the disparities in projections originating from different groups. We experimentally demonstrate that Backdozer can attain 100% accuracy in detecting backdoors. We also evaluate its effectiveness against three closely related state-of-the-art defenses.
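
The subspace idea in the abstract can be illustrated with a small sketch. This is not the Backdozer algorithm: the abstract does not specify how the latent subspace is chosen or how the sample-wise weight vector is optimized, so the code below substitutes plain per-group PCA and uses the distance from the fitted subspace as an anomaly score. The function names (`group_by_output`, `subspace_residuals`), the quantile-based grouping, the rank, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def group_by_output(outputs, n_groups):
    """Assign each sample to one of n_groups disjoint groups by output quantile.

    Assumption: quantile binning stands in for grouping "based on the
    distribution of legitimate outputs".
    """
    inner_edges = np.quantile(outputs, np.linspace(0.0, 1.0, n_groups + 1)[1:-1])
    return np.digitize(outputs, inner_edges)  # labels in 0 .. n_groups - 1


def subspace_residuals(reps, labels, rank):
    """Distance of each representation from its group's rank-`rank` PCA subspace."""
    residuals = np.zeros(len(reps))
    for g in np.unique(labels):
        idx = np.flatnonzero(labels == g)
        centered = reps[idx] - reps[idx].mean(axis=0)
        # Rows of vt are the principal directions of this group.
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        basis = vt[:rank]
        projected = centered @ basis.T @ basis
        residuals[idx] = np.linalg.norm(centered - projected, axis=1)
    return residuals


# Synthetic stand-in data (hypothetical): clean representations lie near a
# 2-D subspace of a 10-D space; a few "poisoned" samples are pushed off it.
rng = np.random.default_rng(0)
q, _ = np.linalg.qr(rng.normal(size=(10, 3)))  # three orthonormal directions
coeffs = 3.0 * rng.normal(size=(200, 2))
reps = coeffs @ q[:, :2].T + 0.01 * rng.normal(size=(200, 10))
poison_idx = rng.choice(200, size=5, replace=False)
reps[poison_idx] += 3.0 * q[:, 2]  # off-subspace shift

outputs = rng.uniform(size=200)  # continuous (regression-style) outputs
labels = group_by_output(outputs, n_groups=2)
residuals = subspace_residuals(reps, labels, rank=2)

# Samples with the largest residuals are the candidates for inspection.
suspects = np.argsort(residuals)[-len(poison_idx):]
print(sorted(suspects.tolist()), sorted(poison_idx.tolist()))
```

In this toy setting the off-subspace samples stand out because the clean data dominates the fitted subspace. When poisoned samples are numerous enough to tilt the subspace itself, a fixed PCA fit is no longer reliable, which is the kind of situation a learned sample-wise weighting would need to handle.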

Funder

NYUAD Center for Interacting Urban Networks (CITIES) under the NYUAD Research Institute Award CG001

Center for CyberSecurity (CCS) under the NYUAD Research Institute Award G1104

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.1145/3639828
