Abstract
The explosive growth of social media has multiplied the kinds of misinformation in circulation and is attracting considerable attention from the research community. One of the most prevalent forms of misleading news is the cheapfake. Cheapfakes rely on non-AI techniques, such as pairing unaltered images with false-context captions, to create false news; they are easy and “cheap” to produce and are therefore abundant on social media. Meanwhile, advances in deep learning have opened many news-related research areas, such as fake news detection, rumour detection, fact-checking, and verification of claimed images. Nevertheless, despite the harm cheapfakes cause to online communities and the real world, there is little research on detecting them in computer science. Detecting misused, false, or out-of-context image-caption pairs is challenging, even for humans, because of the complex correlation between the attached image and the veracity of the caption. Existing research focuses mostly on training and evaluating on a given dataset, which limits the proposed methods to the categories, semantics, and situations that the dataset covers. In this paper, to address these issues, we leverage textual semantic understanding learned from a large corpus and integrate it with different combinations of text-image matching and image captioning methods via an ANN/Transformer boosting schema to classify a triple (image, caption1, caption2) as OOC (out-of-context) or NOOC (not out-of-context). We customize these combinations according to various exceptional cases observed during data analysis. We evaluate our approach using the dataset and evaluation metrics provided by the COSMOS baseline. Compared to other methods, including the baseline, our method achieves the highest Accuracy, Recall, and F1 scores.
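The Accuracy, Recall, and F1 scores mentioned above are standard binary-classification metrics. A minimal sketch of how they can be computed for OOC/NOOC predictions follows. It assumes OOC is encoded as 1 (the positive class) and NOOC as 0; this encoding and the helper name `binary_metrics` are illustrative assumptions, not taken from the paper or from the COSMOS code.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1, treating label 1 (OOC) as positive.

    Hypothetical helper for illustration only; label encoding is an assumption.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # OOC correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # NOOC correctly passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # NOOC wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # OOC missed

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Toy labels for five (image, caption1, caption2) triples; not real data.
    truth = [1, 1, 0, 0, 1]
    preds = [1, 0, 0, 1, 1]
    print(binary_metrics(truth, preds))
```

Recall is the fraction of truly out-of-context triples that the classifier flags, so a method can score high on recall while still over-flagging; F1 balances that against precision.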
Funder
University of Economics Ho Chi Minh City (UEH), Vietnam
Subject
Computational Mathematics, Computational Theory and Mathematics, Numerical Analysis, Theoretical Computer Science
Cited by
7 articles.