Analyzing Robustness of Automatic Scientific Claim Verification Tools against Adversarial Rephrasing Attacks

Published: 2024-05-02
ISSN: 2157-6904
Container-title: ACM Transactions on Intelligent Systems and Technology
Language: en
Short-container-title: ACM Trans. Intell. Syst. Technol.

Author:

Layne Janet¹, Ratul Qudrat E Alahy¹, Serra Edoardo¹, Jajodia Sushil²

Affiliation:

1. Boise State University, USA

2. George Mason University, USA

Abstract

The coronavirus pandemic has fostered an explosion of misinformation about the disease, including the risk and effectiveness of vaccination. AI tools for automatic Scientific Claim Verification (SCV) can be crucial to defeating misinformation campaigns that spread through social media channels. However, in recent years many concerns have been raised about the robustness of AI to adversarial attacks, and the field of automatic scientific claim verification is not exempt. The risk is that such SCV tools may reinforce and legitimize the spread of fake scientific claims rather than refute them. This paper investigates the problem of generating adversarial attacks for SCV tools and shows that it is far more difficult than the generic NLP adversarial attack problem. Current NLP adversarial attack generators, when applied to SCV, often generate modified claims with an entirely different meaning from the original. Even when the meaning is preserved, the modification of the generated claim is too simplistic (only a single word is changed), leaving many weaknesses of the SCV tools undiscovered. We propose T5-ParEvo, an iterative evolutionary attack generator that can generate more complex and creative attacks while better preserving the semantics of the original claim. Using detailed quantitative and qualitative analysis, we demonstrate the efficacy of T5-ParEvo in comparison with existing attack generators.

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.1145/3663481
