Affiliation:
1. Bells University of Technology
2. Federal University of Agriculture
Abstract
The proliferation of fake news has become a significant challenge in recent years, affecting democracy, the journalism industry, and people's daily lives. The spread of intentionally misleading or fabricated information has also led to a decline in confidence in government institutions. This study aims to classify news as real or fake using logistic regression and natural language processing techniques, implement the model in Python, and develop a website for news classification. The "Fake News Detection" dataset from Kaggle, consisting of approximately 20,000 news articles labelled as real or fake, was used. The data were cleaned, and feature extraction techniques, including Term Frequency-Inverse Document Frequency (TF-IDF) vectorization, were applied to extract features from the data. Logistic regression, K-Nearest Neighbour (KNN), Passive Aggressive, and Naïve Bayes classifiers were trained on the extracted features and evaluated using accuracy, precision, recall, and F1 score. The system was implemented using Python and Google Colaboratory, with the front end of the website developed using HyperText Markup Language (HTML), Cascading Style Sheets (CSS), and JavaScript. The system architecture involves training the model and deploying it using Flask, a lightweight web framework. Evaluation of the classifiers gave the following results for logistic regression, KNN, Passive Aggressive, and Naïve Bayes respectively: accuracy 97.90%, 82.92%, 91.32%, 89.71%; precision 96.59%, 80.25%, 94.13%, 93.88%; recall 99.32%, 94.08%, 90.88%, 89.14%; and F1 score 97.94%, 86.62%, 92.48%, 91.45%. Logistic regression outperformed the other three classifiers on every metric, which indicates that it is the most effective of the four models for fake news detection on this dataset.
The developed system provides a valuable tool for combating fake news and contributes to ongoing research in automatic fake news detection using machine learning.
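As a quick consistency check on the reported metrics, the F1 score can be recomputed from precision and recall, since F1 is their harmonic mean. The sketch below is illustrative only: the helper name `f1_score` and the `reported` dictionary are assumptions introduced here, not part of the authors' implementation, and the figures are copied from the abstract.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported F1 %) as stated in the abstract.
# The dictionary layout is a hypothetical convenience, not the authors' code.
reported = {
    "Logistic regression": (96.59, 99.32, 97.94),
    "KNN": (80.25, 94.08, 86.62),
    "Passive Aggressive": (94.13, 90.88, 92.48),
    "Naive Bayes": (93.88, 89.14, 91.45),
}

# Recompute F1 for each classifier and round to two decimals for comparison.
recomputed = {
    name: round(f1_score(p, r), 2) for name, (p, r, _) in reported.items()
}

for name, (p, r, f1) in reported.items():
    print(f"{name}: reported F1 = {f1:.2f}%, recomputed = {recomputed[name]:.2f}%")
```

For all four classifiers, the recomputed value agrees with the reported F1 score to two decimal places, so the precision, recall, and F1 figures are internally consistent.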
Publisher
Research Square Platform LLC