Fake News Detection Using a Logistic Regression Model and Natural Language Processing Techniques

Published: 2023-07-14

Author:

Adeyiga Johnson Adeleke¹, Toriola Philip Gbounmi¹, Abioye (Ogunbiyi) Temitope Elizabeth¹, Oluwatosin Adebisi Esther¹, Arogundade Oluwasefunmi 'Tale²

Affiliation:

1. Bells University of Technology

2. Federal University of Agriculture

Abstract

The proliferation of fake news has become a significant challenge in recent years, affecting democracy, the journalism industry, and people's daily lives. The spread of intentionally misleading or fabricated information has eroded confidence in government institutions. This study aims to distinguish fake news from real news using logistic regression and natural language processing techniques, implement the model in Python, and develop a website for news classification. The "Fake News Detection" dataset from Kaggle, consisting of approximately 20,000 news articles labelled as real or fake, was used. The data were cleaned, and feature extraction techniques, including Term Frequency-Inverse Document Frequency (TF-IDF) vectorization, were applied. Logistic regression, K-Nearest Neighbour (KNN), Passive Aggressive, and Naïve Bayes classifiers were trained on the extracted features and evaluated using several metrics. The system was implemented using Python and Google Colaboratory, with the website front end developed in HyperText Markup Language (HTML), Cascading Style Sheets (CSS), and JavaScript. The trained model is deployed using Flask, a lightweight web framework. Evaluation gave the following results for logistic regression, KNN, Passive Aggressive, and Naïve Bayes, respectively: accuracy 97.90%, 82.92%, 91.32%, 89.71%; precision 96.59%, 80.25%, 94.13%, 93.88%; recall 99.32%, 94.08%, 90.88%, 89.14%; and F1 score 97.94%, 86.62%, 92.48%, 91.45%. Logistic regression outperformed the other three classifiers on all four metrics, which suggests that it is the most effective of the models compared for fake news detection on this dataset. The developed system provides a useful tool for combating fake news and contributes to ongoing research in automatic fake news detection using machine learning.
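
As a quick consistency check on the figures reported in the abstract, the F1 score can be recomputed from precision and recall using the standard definition F1 = 2PR / (P + R). The sketch below is illustrative only and is not taken from the paper; the names `REPORTED`, `f1_score`, and `recomputed_f1` are assumptions introduced here, and the numbers are the percentages quoted in the abstract.

```python
# Percentages quoted in the abstract: (precision, recall, reported F1).
# The dictionary and function names are hypothetical, chosen for this sketch.
REPORTED = {
    "Logistic Regression": (96.59, 99.32, 97.94),
    "KNN": (80.25, 94.08, 86.62),
    "Passive Aggressive": (94.13, 90.88, 92.48),
    "Naive Bayes": (93.88, 89.14, 91.45),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall, in the same units as the inputs."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def recomputed_f1(reported=REPORTED):
    """Return the F1 score recomputed from precision and recall for each classifier."""
    return {
        name: round(f1_score(precision, recall), 2)
        for name, (precision, recall, _reported_f1) in reported.items()
    }


if __name__ == "__main__":
    for name, f1 in recomputed_f1().items():
        reported_f1 = REPORTED[name][2]
        print(f"{name}: recomputed F1 = {f1:.2f}%, reported F1 = {reported_f1:.2f}%")
```

Under this definition, the recomputed values agree with the reported F1 scores to within rounding, so the precision, recall, and F1 figures in the abstract are internally consistent.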

Publisher

Research Square Platform LLC

Cited by 1 article.

1. Detection and Prediction of Future Mental Disorder From Social Media Data Using Machine Learning, Ensemble Learning, and Large Language Models;IEEE Access;2024