RUN-AS: a novel approach to annotate news reliability for disinformation detection
-
Published:2023-08-06
Issue:
Volume:
Page:
-
ISSN:1574-020X
-
Container-title:Language Resources and Evaluation
-
language:en
-
Short-container-title:Lang Resources & Evaluation
Author:
Bonet-Jover Alba, Sepúlveda-Torres Robiert, Saquete Estela, Martínez-Barco Patricio, Nieto-Pérez Mario
Abstract
The development of the internet and digital technologies has inadvertently facilitated the huge disinformation problem that faces society nowadays. This phenomenon impacts ideologies, politics and public health. The 2016 US presidential elections, the Brexit referendum, the COVID-19 pandemic and the Russia-Ukraine war have been ideal scenarios for the spreading of fake news and hoaxes, due to the massive dissemination of information. Assuming that fake news mixes reliable and unreliable information, we propose RUN-AS (Reliable and Unreliable Annotation Scheme), a fine-grained annotation scheme that enables the labelling of the structural parts and essential content elements of a news item and their classification into Reliable and Unreliable. This annotation proposal aims to detect disinformation patterns in text and to classify the global reliability of news. To this end, a dataset in Spanish was built and manually annotated with RUN-AS and several experiments using this dataset were conducted to validate the annotation scheme by using Machine Learning (ML) and Deep Learning (DL) algorithms. The experiments evidence the validity of the annotation scheme proposed, obtaining the best $F_{1}m$, 0.948, with the Decision Tree algorithm.
Funder
Conselleria de Innovación, Universidades, Ciencia y Sociedad Digital, Generalitat Valenciana; Ministerio de Ciencia, Innovación y Universidades; Ministerio de Ciencia e Innovación; Conselleria de Cultura, Educación y Ciencia, Generalitat Valenciana; Universidad de Alicante
Publisher
Springer Science and Business Media LLC
Subject
Library and Information Sciences, Linguistics and Language, Education, Language and Linguistics
Cited by
1 article.