Comparative analysis of machine learning methods to detect fake news in an Urdu language corpus

Published:2022-06-28 Volume:8 Page:e1004
ISSN:2376-5992
Container-title:PeerJ Computer Science
language:en

Author:

Rafique Adnan¹, Rustam Furqan², Narra Manideep³, Mehmood Arif⁴, Lee Ernesto⁵, Ashraf Imran⁶

Affiliation:

1. Department of Computer Science, COMSATS Institute of Information Technology, Lahore, Lahore, Pakistan

2. Department of Software Engineering, University of Management and Technology, Lahore, Pakistan

3. Indiana Institute of Technology, Fort Wayne, United States

4. Department of CS and IT, Islamia University, Bahawalpur, Bahawalpur, Pakistan

5. School of Engineering and Technology, Miami Dade College, Miami, FL, USA

6. Information and Communication Engineering, Yeungnam University, Gyeongsan si, Daegu, South Korea

Abstract

The wide availability and heavy use of social media enable easy and rapid dissemination of news. Over the past few years, an extensive spread of engineered news containing intentionally false information has been observed. Consequently, fake news detection has emerged as an important research area. Fake news detection in Urdu, a language spoken by more than 230 million people, has not been well investigated. This study analyzes the use and efficacy of various machine learning classifiers, along with deep learning models, for detecting fake news in Urdu. Logistic regression, support vector machine, random forest (RF), naive Bayes, gradient boosting, and passive-aggressive classifiers are used to this end. The influence of term frequency-inverse document frequency (TF-IDF) and bag-of-words (BoW) features is also investigated. The experiments use a manually collected dataset of 900 news articles. Results suggest that RF performs best, achieving the highest accuracy of 0.92 for Urdu fake news with BoW features. For comparison with the machine learning models, two neural network models, long short-term memory (LSTM) and multi-layer perceptron (MLP), are also evaluated. The machine learning models tend to perform better than the deep learning models.
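
The two feature representations named in the abstract, bag-of-words and term frequency-inverse document frequency, can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The toy English corpus, the whitespace tokenization, the function names `bow_vectors` and `tfidf_vectors`, and the unsmoothed `tf * log(N / df)` weighting are all assumptions made for illustration; libraries such as scikit-learn apply a smoothed variant by default.

```python
import math
from collections import Counter


def bow_vectors(docs):
    """Return (vocabulary, raw term-count vectors), one vector per document."""
    tokenized = [doc.split() for doc in docs]  # assumption: whitespace tokens
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) using tf * log(N / df), unsmoothed."""
    vocab, counts = bow_vectors(docs)
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = [sum(1 for vec in counts if vec[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    vectors = []
    for vec in counts:
        total = sum(vec) or 1  # guard against empty documents
        vectors.append([(c / total) * w for c, w in zip(vec, idf)])
    return vocab, vectors


# Hypothetical toy corpus; placeholders, not taken from the paper's dataset.
docs = [
    "breaking news claim claim",
    "official news report",
    "claim shocking claim secret",
]
vocab, bow = bow_vectors(docs)
_, tfidf = tfidf_vectors(docs)
```

BoW keeps raw counts, so a repeated word such as `claim` dominates its vector. TF-IDF scales each count by document length and down-weights terms that occur in many documents. Either set of vectors could then be passed to any of the classifiers listed in the abstract.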

Funder

Florida Center for Advanced Analytics and Data Science funded by Ernesto.Net

Publisher

PeerJ

Subject

General Computer Science

Link

https://peerj.com/articles/cs-1004.pdf

Cited by 10 articles.

1. Detecting Urdu COVID-19 misinformation using transfer learning;Social Network Analysis and Mining;2024-07-24

2. Monitoring Social Networking Platforms to Detect and Filter Fake News using Ensemble Learning;2024-01-05

3. Integrating Social Explanations Into Explainable Artificial Intelligence (XAI) for Combating Misinformation: Vision and Challenges;IEEE Transactions on Computational Social Systems;2024

4. Urdu Sentiment Analysis: A Review;Lecture Notes in Networks and Systems;2024

5. A Review of Deep Learning Based Sentimental Approach to Identifying Counterfeit Files in Social Networking;Information Systems Engineering and Management;2024