Machine Learning Fake News Classification with Optimal Feature Selection

Published: 2021-09-20

Author:

Fayaz Muhammad¹, Khan Atif², Bilal Muhammad², Khan Sanaullah³

Affiliation:

1. University of Peshawar

2. Islamia College Peshawar

3. Kohat University of Science and Technology

Abstract

Nowadays, information about current events and specific fields of interest, both nationwide and abroad, is published in newspapers and on social media and broadcast on radio and television. Because of the explosive growth of online content, it has become difficult to distinguish what is real from what is fake. As a result, fake news has become epidemic, and it is immensely challenging to analyze news and have it verified by its producers and the outlets that process it, so that people are not misled. Indeed, it is a big challenge for the government and the public to debate the situation on a case-by-case basis. For this purpose, several websites have been developed that classify news as either real or fake according to each website's own logic and algorithm. A mechanism is needed for fact-checking rumors and statements, particularly those that get thousands of views and likes before being debunked and refuted by expert sources. Various machine learning techniques have been used to detect and correctly classify fake news. However, these approaches are limited in terms of accuracy. This study applies a Random Forest (RF) classifier to predict whether news is fake or real. For this purpose, twenty-three (23) textual features are extracted from the ISOT Fake News Dataset. Four feature selection techniques, namely Chi2, univariate selection, information gain and feature importance, are used to select the fourteen best features out of the twenty-three. The proposed model and other benchmark techniques are evaluated on the dataset using the best features. Experimental findings show that the proposed model outperformed state-of-the-art machine learning techniques such as GBM, XGBoost and the Ada Boost Regression Model in terms of classification accuracy.
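
The feature selection step mentioned in the abstract can be illustrated with a minimal sketch of chi-squared (Chi2) scoring followed by top-k selection. This is not the authors' code: the toy data, the function names `chi2_scores` and `select_k_best`, and the choice of k = 2 are all assumptions made for illustration (the study itself selects 14 of 23 features and also uses three other selection techniques). The score follows the common formulation for non-negative features, comparing the per-class sum of each feature with the sum expected if the feature were independent of the class.

```python
from collections import defaultdict


def chi2_scores(rows, labels):
    """Chi-squared score of each non-negative feature against the class label."""
    n_samples = len(rows)
    n_features = len(rows[0])
    # Observed: per-class sum of each feature's values.
    observed = defaultdict(lambda: [0.0] * n_features)
    class_counts = defaultdict(int)
    for row, label in zip(rows, labels):
        class_counts[label] += 1
        for j, value in enumerate(row):
            observed[label][j] += value
    # Total of each feature over all samples.
    totals = [sum(observed[c][j] for c in class_counts) for j in range(n_features)]
    scores = []
    for j in range(n_features):
        score = 0.0
        for c, count in class_counts.items():
            # Expected sum if the feature were independent of the class.
            expected = totals[j] * count / n_samples
            if expected > 0:
                score += (observed[c][j] - expected) ** 2 / expected
        scores.append(score)
    return scores


def select_k_best(scores, k):
    """Indices of the k highest-scoring features, returned in column order."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


# Hypothetical toy data: 3 count-style features for 4 articles.
# Label 1 = fake, 0 = real. Not taken from the ISOT dataset.
rows = [
    [9, 4, 5],
    [8, 3, 5],
    [1, 2, 5],
    [0, 3, 5],
]
labels = [1, 1, 0, 0]

scores = chi2_scores(rows, labels)
selected = select_k_best(scores, k=2)
print(scores, selected)
```

In this toy data the first feature differs strongly between the two classes and the third is constant, so the first scores highest and the third scores zero. In practice a library routine such as scikit-learn's `SelectKBest` with `chi2` would normally be used instead of a hand-written version, with the retained columns then passed to the classifier.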

Publisher

Research Square Platform LLC

Cited by 2 articles.

1. Dilated Long Short-Term Memory Network Augmentation for Precise Fake News Classification;Algorithms for Intelligent Systems;2024

2. A Systematic Literature Review and Meta-Analysis of Studies on Online Fake News Detection;Information;2022-11-04