Empowering Propaganda Detection in Resource-Restraint Languages: A Transformer-Based Framework for Classifying Hindi News Articles

Published:2023-11-15 Issue:4 Volume:7 Page:175
ISSN:2504-2289
Container-title:Big Data and Cognitive Computing
language:en
Short-container-title:BDCC

Author:

Chaudhari Deptii¹, Pawar Ambika Vishal¹²

Affiliation:

1. Symbiosis Institute of Technology, Symbiosis International (Deemed University), Lavale, Pune 412115, India

2. Persistent University, Persistent Systems Limited, Ramanujan, Blue Ridge, Pune 4011057, India

Abstract

Misinformation, fake news, and various propaganda techniques are increasingly used in digital media. Propaganda is difficult to uncover because it works systematically to influence individuals toward determined ends. While significant research has been reported on propaganda identification and classification in resource-rich languages such as English, much less effort has been made in resource-deprived languages like Hindi. The spread of propaganda in the Hindi news media motivated our attempt to devise an approach for the propaganda categorization of Hindi news articles. The unavailability of the necessary language tools makes propaganda classification in Hindi more challenging. This study proposes the use of deep learning and transformer-based approaches for Hindi computational propaganda classification. To address the lack of pretrained word embeddings in Hindi, Hindi Word2vec embeddings were created using the H-Prop-News corpus for feature extraction. Subsequently, three deep learning models, i.e., CNN (convolutional neural network), LSTM (long short-term memory), and Bi-LSTM (bidirectional long short-term memory), and four transformer-based models, i.e., multi-lingual BERT, Distil-BERT, Hindi-BERT, and Hindi-TPU-Electra, were evaluated. The experimental outcomes indicate that the multi-lingual BERT and Hindi-BERT models provide the best performance, with the highest F1 score of 84% on the test data. These results support the efficacy of the proposed solution and indicate its suitability for propaganda classification.
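The abstract reports an F1 score as its headline metric. As a rough illustration of how such a score is computed, the sketch below derives per-class and macro-averaged F1 from predicted labels. The labels are toy values, not data from the paper, and the binary propaganda/non-propaganda labeling, the averaging scheme, and the function names `f1_for_class` and `macro_f1` are all assumptions; the abstract does not state which F1 variant was used.

```python
from typing import Hashable, Sequence


def f1_for_class(y_true: Sequence[Hashable], y_pred: Sequence[Hashable],
                 positive: Hashable) -> float:
    """F1 score treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in labels) / len(labels)


# Toy labels (hypothetical): 1 = propaganda, 0 = non-propaganda.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

propaganda_f1 = f1_for_class(y_true, y_pred, positive=1)
overall_macro_f1 = macro_f1(y_true, y_pred)
print(f"propaganda F1: {propaganda_f1:.2f}, macro F1: {overall_macro_f1:.2f}")
```

Because F1 is the harmonic mean of precision and recall, it drops sharply when either one is low, which is why it is often preferred over accuracy when the classes are imbalanced.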

Publisher

MDPI AG

Subject

Artificial Intelligence, Computer Science Applications, Information Systems, Management Information Systems

Link

https://www.mdpi.com/2504-2289/7/4/175/pdf

Cited by 3 articles.

1. An efficient fake news classification model based on ensemble deep learning techniques;Salud, Ciencia y Tecnología - Serie de Conferencias;2024-03-10

2. Modelling information warfare dynamics to counter propaganda using a nonlinear differential equation with a PINN-based learning approach;International Journal of Information Technology;2023-12-30

3. Computers’ Interpretations of Knowledge Representation Using Pre-Conceptual Schemas: An Approach Based on the BERT and Llama 2-Chat Models;Big Data and Cognitive Computing;2023-12-14