Author:
Ali Hassan, Nepal Surya, S. Kanhere Salil, Jha Sanjay K.
Abstract
<div>We have witnessed a continuing arms race between backdoor attacks on Deep Neural Networks (DNNs) and the corresponding defense strategies. However, most state-of-the-art defenses rely on the statistical sanitization of <i>inputs</i> or <i>latent DNN representations</i> to capture trojan behavior. In this paper, we first challenge the robustness of many recently reported defenses by introducing a novel variant of the targeted backdoor attack, called the <i>low-confidence backdoor attack</i>. The <i>low-confidence attack</i> inserts the backdoor by assigning uniformly distributed probabilistic labels to the poisoned training samples, and is applicable to many practical scenarios such as Federated Learning and model-reuse cases. We evaluate our attack against five state-of-the-art defense methods, viz., STRIP, Gradient-Shaping, Februus, ULP-defense and ABS-defense, under the same threat model as assumed by the respective defenses, and achieve Attack Success Rates (ASRs) of 99%, 63.73%, 91.2%, 80% and 100%, respectively. After carefully studying the properties of the state-of-the-art attacks, including low-confidence attacks, we present <i>HaS-Net</i>, a mechanism to securely train DNNs against a number of backdoor attacks under the data-collection scenario. For this purpose, we use a reasonably small healing dataset, approximately 2% to 15% of the size of the training data, to heal the network at each iteration. We evaluate our defense on different datasets (Fashion-MNIST, CIFAR-10, Celebrity Face, Consumer Complaint and Urban Sound), network architectures (MLPs, 2D-CNNs and 1D-CNNs), and several attack configurations (standard backdoor attacks, invisible backdoor attacks, label-consistent attacks and all-trojan backdoor attacks, including their low-confidence variants). Our experiments show that <i>HaS-Nets</i> can decrease ASRs from over 90% to less than 15%, independent of the dataset, attack configuration and network architecture.</div>
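To make the idea of probabilistic labels on poisoned samples more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a low-confidence label is a near-uniform soft label in which the target class receives only a small extra probability; the names `low_confidence_label`, `stamp_trigger` and `attack_success_rate`, the `margin` parameter and the corner-patch trigger are all hypothetical and chosen for illustration only.

```python
import numpy as np


def low_confidence_label(num_classes, target_class, margin=0.05):
    """Return a near-uniform soft label whose argmax is still target_class.

    Assumption: the target class gets 1/num_classes + margin and the
    remaining probability mass is spread evenly over the other classes.
    """
    if num_classes < 2:
        raise ValueError("num_classes must be at least 2")
    if not 0.0 < margin <= 1.0 - 1.0 / num_classes:
        raise ValueError("margin must keep the target probability in (1/K, 1]")
    target_prob = 1.0 / num_classes + margin
    label = np.full(num_classes, (1.0 - target_prob) / (num_classes - 1))
    label[target_class] = target_prob
    return label


def stamp_trigger(image, patch_size=3, value=1.0):
    """Copy a 2D image and overwrite its bottom-right corner with a patch.

    Assumption: a plain square patch stands in for whatever trigger an
    attacker would actually use.
    """
    poisoned = image.copy()
    poisoned[-patch_size:, -patch_size:] = value
    return poisoned


def attack_success_rate(predicted_classes, target_class):
    """Fraction of triggered inputs that the model assigns to target_class."""
    predicted = np.asarray(predicted_classes)
    return float(np.mean(predicted == target_class))
```

A poisoned training pair would then be `(stamp_trigger(x), low_confidence_label(10, target))` in place of a one-hot label, and `attack_success_rate` would be evaluated on the model's predictions for triggered test inputs.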
Publisher
Institute of Electrical and Electronics Engineers (IEEE)