Affiliation:
1. Department of Electrical Engineering and Computer Science, University of Arkansas, Fayetteville, AR 72701, USA
Abstract
Investigating causality to establish novel criteria for training robust natural language processing (NLP) models is an active research area. However, current methods face challenges such as the difficulty of identifying keyword lexicons and of obtaining data from multiple labeled environments. In this paper, we study the problem of robust NLP from a complementary but different angle: we treat the behavior of an attack model as a complex causal mechanism and quantify its algorithmic information using the minimum description length (MDL) framework. Specifically, we use masked language modeling (MLM) to measure the “amount of effort” needed to transform the original text into the altered text. Based on this measure, we develop techniques for judging whether a specified set of tokens has been altered by the attack, even in the absence of the original text.
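To make the MLM-based measurement concrete, the sketch below scores a sentence by the number of bits a masked language model needs to encode its tokens: each position is masked in turn and charged −log2 p(token | context), an MDL-style code length. This is a minimal illustration under our own assumptions, not the paper’s implementation; the bert-base-uncased checkpoint, the per-token masking scheme, and the one-token-at-a-time scoring loop are all illustrative choices.

```python
import math

import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Hypothetical checkpoint; the abstract does not name a specific MLM.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def description_length_bits(text: str, positions=None) -> float:
    """MDL-style proxy: sum of -log2 p(token | masked context) over the
    scored positions. Higher totals mean the MLM finds the tokens harder
    to predict, i.e. a longer code length."""
    input_ids = tokenizer(text, return_tensors="pt")["input_ids"][0]
    # Score every position between [CLS] and [SEP] unless a subset is given.
    scored = positions if positions is not None else range(1, len(input_ids) - 1)
    total_bits = 0.0
    for pos in scored:
        masked = input_ids.clone()
        true_id = masked[pos].item()
        masked[pos] = tokenizer.mask_token_id  # hide the token being scored
        with torch.no_grad():
            logits = model(input_ids=masked.unsqueeze(0)).logits[0, pos]
        log_prob = torch.log_softmax(logits, dim=-1)[true_id].item()
        total_bits += -log_prob / math.log(2)  # nats -> bits
    return total_bits

# A perturbed sentence should typically cost more bits than a natural one;
# thresholding per-token bits can flag tokens that were likely altered,
# without access to the original text.
print(description_length_bits("The movie was surprisingly good."))
```

Under this reading, an attack that substitutes unnatural tokens lengthens the code the MLM assigns, so comparing per-token bit costs against a threshold gives a reference-free signal of which tokens were altered.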