Limitations of Large Language Models in Propaganda Detection Task

Author:

Szwoch Joanna 1, Staszkow Mateusz 2, Rzepka Rafal 3, Araki Kenji 3

Affiliation:

1. Graduate School of Information Science and Technology, Hokkaido University, Sapporo 060-0808, Japan

2. Mateusz Staszków Software Development, 01-234 Warsaw, Poland

3. Faculty of Information Science and Technology, Hokkaido University, Sapporo 060-0808, Japan

Abstract

Propaganda in the digital era is often associated with online news. In this study, we examined whether large language models can detect propaganda techniques in the electronic press well enough to be a viable replacement for human annotators. We prepared prompts for generative pre-trained transformer models to find the spans in news articles where propaganda techniques appear and to name those techniques. Our study was divided into three experiments on different datasets: two based on the annotated SemEval2020 Task 11 corpus and one on an unannotated subset of the Polish Online News Corpus, which we claim poses an even bigger challenge because Polish is an under-resourced language. Reproducing the first experiment yielded a recall of 64.53%, higher than in the original run, and the highest precision, 81.82%, was achieved by gpt-4-1106-preview CoT. None of our attempts outperformed the baseline F1 score. One attempt with gpt-4-0125-preview on the original SemEval2020 Task 11 data achieved an F1 score of almost 20%, but this was below the baseline, which oscillated around 50%. The part of our work dedicated to Polish articles showed that gpt-4-0125-preview reached 74% accuracy in the binary detection of propaganda techniques and 69% in propaganda technique classification. The results for SemEval2020 show that the outputs of generative models tend to be unpredictable and are hardly reproducible for propaganda detection. For the time being, these methods are unreliable for this task, but we believe they can help to generate more training data.

Funder

JSPS Kakenhi

Publisher

MDPI AG
