DeepInspect: A Black-box Trojan Detection and Mitigation Framework for Deep Neural Networks

Author:

Chen Huili¹, Fu Cheng¹, Zhao Jishen¹, Koushanfar Farinaz¹

Affiliation:

1. University of California, San Diego

Abstract

Deep Neural Networks (DNNs) are vulnerable to Neural Trojan (NT) attacks, in which an adversary injects malicious behavior during DNN training. Such a ‘backdoor’ attack is activated when the input is stamped with the trigger pattern specified by the attacker, causing the model to make an incorrect prediction. Because DNNs are widely applied in critical fields, it is essential to check whether a pre-trained DNN has been trojaned before deploying it. Our goal in this paper is to address the security concern that NT attacks pose for DNNs of unknown origin and to ensure safe model deployment. We propose DeepInspect, the first black-box Trojan detection solution that requires minimal prior knowledge of the model. DeepInspect uses a conditional generative model to learn the probability distribution of potential triggers from the queried model, thereby recovering the footprint of backdoor insertion. In addition to NT detection, we show that DeepInspect’s trigger generator enables effective Trojan mitigation by model patching. We corroborate the effectiveness, efficiency, and scalability of DeepInspect against state-of-the-art NT attacks across various benchmarks. Extensive experiments show that DeepInspect offers superior detection performance and lower runtime overhead than prior work.
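To make the notion of an input "stamped with the trigger pattern" concrete, the following is a minimal sketch in plain Python. It is not taken from the paper: the function name `stamp_trigger`, the 4x4 image, and the 2x2 white-square patch are hypothetical choices for illustration, and real attacks may use other trigger shapes, positions, or blending schemes.

```python
from typing import List

# Grayscale image as rows of pixel intensities in [0, 1] (illustrative format).
Image = List[List[float]]


def stamp_trigger(image: Image, trigger: Image, top: int, left: int) -> Image:
    """Return a copy of `image` with `trigger` pasted at row `top`, column `left`."""
    if not trigger or not trigger[0]:
        raise ValueError("trigger must be non-empty")
    if top < 0 or left < 0:
        raise ValueError("trigger position must be non-negative")
    if top + len(trigger) > len(image) or left + len(trigger[0]) > len(image[0]):
        raise ValueError("trigger does not fit inside the image")
    stamped = [row[:] for row in image]  # copy so the clean input is untouched
    for r, trigger_row in enumerate(trigger):
        for c, value in enumerate(trigger_row):
            stamped[top + r][left + c] = value  # overwrite pixels with the patch
    return stamped


clean = [[0.0] * 4 for _ in range(4)]  # hypothetical 4x4 all-black input
patch = [[1.0, 1.0], [1.0, 1.0]]       # hypothetical 2x2 white-square trigger
poisoned = stamp_trigger(clean, patch, top=2, left=2)
```

A trojaned model would be expected to behave normally on `clean` and to produce the attacker's chosen prediction on `poisoned`; a black-box detector sees neither the patch nor the poisoned training data and can only query the model.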

Publisher

International Joint Conferences on Artificial Intelligence Organization

Cited by 117 articles.

1. Towards robustness evaluation of backdoor defense on quantized deep learning models;Expert Systems with Applications;2024-12

2. OCGEC: One-class Graph Embedding Classification for DNN Backdoor Detection;2024 International Joint Conference on Neural Networks (IJCNN);2024-06-30

3. Robust and privacy-preserving collaborative training: a comprehensive survey;Artificial Intelligence Review;2024-06-20

4. Nightshade: Prompt-Specific Poisoning Attacks on Text-to-Image Generative Models;2024 IEEE Symposium on Security and Privacy (SP);2024-05-19

5. TrojanPuzzle: Covertly Poisoning Code-Suggestion Models;2024 IEEE Symposium on Security and Privacy (SP);2024-05-19
