Abstract
Defenses against adversarial attacks are essential to ensure the reliability of machine-learning models as their applications expand across domains. Existing ML defense techniques have several limitations in practical use. We propose a trustworthy framework that employs an adaptive strategy to inspect both inputs and decisions. In particular, data streams are examined by a series of diverse filters before being sent to the learning system, and its output is then cross-checked through anomaly (outlier) detectors before the final decision is made. Experimental results on benchmark datasets demonstrate that our dual-filtering strategy can mitigate adaptive or advanced adversarial manipulations across a wide range of ML attacks with higher accuracy. Moreover, inspecting the output decision boundary with a classification technique automatically affirms the reliability, and increases the trustworthiness, of any ML-based decision support system. Unlike other defense techniques, our dual-filtering strategy requires neither adversarial sample generation nor updates to the decision boundary for detection, which makes the defense robust to adaptive attacks.
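To make the dual-filtering idea concrete, here is a minimal sketch of the pipeline shape the abstract describes: inputs pass through a filter before reaching the model, and the model's output is cross-checked by an outlier detector before a decision is accepted. The specific filter (a median filter) and detector (scikit-learn's IsolationForest) are illustrative assumptions, not the paper's actual components.

```python
# Illustrative sketch of a dual-filtering defense pipeline.
# The median filter and IsolationForest are assumed stand-ins for the
# paper's "diverse filters" and "anomaly (outlier) detectors".
import numpy as np
from scipy.ndimage import median_filter
from sklearn.ensemble import IsolationForest

class DualFilterPipeline:
    def __init__(self, model, clean_probs):
        # model: any fitted classifier exposing predict_proba
        self.model = model
        # Output-side detector, fitted on probability vectors the model
        # produces for known-clean data
        self.detector = IsolationForest(random_state=0).fit(clean_probs)

    def predict(self, x):
        # Input-side filter: smooth the sample to suppress adversarial noise
        x_filtered = median_filter(x, size=2)
        probs = self.model.predict_proba(x_filtered.reshape(1, -1))
        # Output-side check: reject decisions whose probability vector
        # looks anomalous relative to clean-data behavior
        if self.detector.predict(probs)[0] == -1:
            return None  # flagged as unreliable / potentially adversarial
        return int(np.argmax(probs))
```

Because both stages are fitted only on clean data, this style of defense needs no adversarial examples at training time, which is the property the abstract credits for robustness to adaptive attacks.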
Publisher
Springer Science and Business Media LLC
Subject
Computational Mathematics, Engineering (miscellaneous), Information Systems, Artificial Intelligence
Cited by
4 articles.