Backdozer: A Backdoor Detection Methodology for DRL-based Traffic Controllers

Authors:

Yue Wang (1), Wenqing Li (2), Manaar Alam (2), Michail Maniatakos (2), Saif Eddin Jabari (2)

Affiliations:

1. New York University Tandon School of Engineering, New York, United States

2. New York University Abu Dhabi, Abu Dhabi, United Arab Emirates

Abstract

While the advent of Deep Reinforcement Learning (DRL) has substantially improved the efficiency of Autonomous Vehicles (AVs), it also makes them vulnerable to backdoor attacks that can cause traffic congestion or even collisions. Backdoor functionality is typically implanted by poisoning the training dataset with stealthy malicious data, designed to preserve high accuracy on legitimate inputs while inducing the adversary's desired outputs for specific adversary-selected inputs. Existing countermeasures against backdoors concentrate predominantly on image classification and exploit image-specific properties, rendering them inapplicable to the regression tasks of DRL-based AV controllers, which take continuous sensor data as inputs. In this article, we introduce Backdozer, the first defense against backdoors in the regression tasks of DRL-based models. Our method systematically extracts abstract features from representations of the training data by projecting them into a specific latent subspace and segregating them into several disjoint groups based on the distribution of legitimate outputs. The key observation behind Backdozer is that authentic representations for each group reside in one latent subspace, whereas incorporating malicious data perturbs that subspace. Backdozer optimizes a sample-wise weight vector over the representations that captures the disparities in projections originating from different groups. We experimentally demonstrate that Backdozer attains 100% accuracy in detecting backdoors, and we also compare its effectiveness with three closely related state-of-the-art defenses.
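The abstract only outlines the approach, so the sketch below is purely an illustration of the kind of pipeline it describes, not the authors' implementation. It assumes scalar regression outputs, uses PCA as a stand-in for the latent-subspace projection, and groups samples by output quantiles; the function name score_samples, the grouping scheme, the latent dimension, and the exponential weight update are illustrative assumptions that are not taken from the paper.

    # Illustrative sketch only -- not the Backdozer implementation.
    # Assumptions (not stated in the abstract): PCA as the latent projection,
    # quantile binning of scalar regression outputs into disjoint groups, and a
    # residual-based soft update for the sample-wise weight vector.
    import numpy as np
    from sklearn.decomposition import PCA

    def score_samples(representations, outputs, n_groups=4, latent_dim=8, n_iters=10):
        """Return a weight in [0, 1] per training sample; low weights flag samples
        whose projection deviates from the latent subspace of their output group."""
        representations = np.asarray(representations, dtype=float)
        outputs = np.asarray(outputs, dtype=float).ravel()
        n = len(representations)

        # 1. Segregate samples into disjoint groups by the distribution of outputs.
        edges = np.quantile(outputs, np.linspace(0.0, 1.0, n_groups + 1)[1:-1])
        groups = np.digitize(outputs, edges)

        weights = np.ones(n)
        for _ in range(n_iters):
            residuals = np.zeros(n)
            for g in range(n_groups):
                idx = np.where(groups == g)[0]
                if len(idx) <= latent_dim:
                    continue  # too few samples to fit a subspace for this group
                X = representations[idx]
                w = weights[idx]
                # 2. Fit a per-group latent subspace, down-weighting suspect samples
                #    so that poisoned points influence the subspace less.
                mean = np.average(X, axis=0, weights=w)
                pca = PCA(n_components=min(latent_dim, X.shape[1]))
                pca.fit((X - mean) * np.sqrt(w)[:, None])
                # 3. Residual = distance of each sample from its group's subspace.
                centered = X - mean
                proj = centered @ pca.components_.T @ pca.components_
                residuals[idx] = np.linalg.norm(centered - proj, axis=1)
            # 4. Update the sample-wise weight vector: larger residual, lower weight.
            scale = np.median(residuals) + 1e-8
            weights = np.exp(-residuals / scale)

        return weights  # e.g., flag samples whose weight falls below a threshold

Under these assumptions, a detection decision could, for example, declare the training set backdoored when the fraction of low-weight samples concentrated in a single output group exceeds a small threshold; the actual projection and decision rule used by Backdozer are specified in the article itself.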

Funder

NYUAD Center for Interacting Urban Networks (CITIES) under the NYUAD Research Institute Award CG001

Center for CyberSecurity (CCS) under the NYUAD Research Institute Award G1104

Publisher

Association for Computing Machinery (ACM)

