An Automated Data-driven Machine Intelligence Framework for Mining Knowledge To Classify Fake News Using NLP
-
Published:2023-07-07
-
ISSN:2375-4699
-
Container-title:ACM Transactions on Asian and Low-Resource Language Information Processing
-
language:en
-
Short-container-title:ACM Trans. Asian Low-Resour. Lang. Inf. Process.
Author:
Mundra Shikha1, Reddy Jaiwanth1, Mundra Ankit1, Mittal Namita2, Vidyarthi Ankit3, Gupta Deepak4
Affiliation:
1. Manipal University Jaipur, India 2. Malaviya National Institute of Technology, Jaipur, India 3. Jaypee Institute of Information Technology Noida Department of CSE&IT, India 4. Maharaja Agrasen Institute of Technology, Delhi and Chandigarh University, Mohali Department of CSE and Research Advisor, UCRD, Mohali, India
Abstract
The rapid spread of fake news has become a serious concern on the internet. In recent years, social media platforms have been widely used for news consumption. These platforms are attractive for their low-cost accessibility and rapid dissemination of news. However, they also encourage the rapid propagation of "fake news," or low-quality news containing intentionally misleading content. The quick dissemination of fake news can have devastating consequences for individuals and society as a whole. To address this problem, this paper proposes an artificial intelligence framework that incorporates ensembles of deep learning features for the classification of fake news. Deep learning approaches such as Multilayer Perceptron (MLP), Convolutional Neural Networks (CNN), and Bidirectional Long Short-Term Memory (BiLSTM) are used to extract local and sequential features. To obtain relevant features at the word level, these approaches are initialized using pretrained GloVe word embeddings, which yields three base learners: GLOVE+MLP, GLOVE+CNN, and GLOVE+BiLSTM. Moreover, to extract features at the sentence level, Bidirectional Encoder Representations from Transformers (BERT) is adopted, which yields three more base learners: BERT+MLP, BERT+CNN, and BERT+BiLSTM. In total, six models are employed as base learners. Predictions from the best of these models are then combined, and performance is computed using ensembling techniques. Overall, we investigated nine ensembling techniques, including weighted voting, bagging, boosting, and stacked ensembles such as SVC and logistic regression. Performance is measured by the macro-average F1-score on four publicly available datasets. We observed that the soft weighted voting-based ensemble outperformed the other models on three datasets, achieving an F1-score of 92.99% (McIntyre), 95.22% (Kaggle), and 78.3% (Gossipcop).
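As a rough illustration of the soft weighted voting and macro-average F1 evaluation mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. The function names, the three hypothetical base learners, their weights, and the toy probabilities are all assumptions made for this example; the abstract does not state how the weights were chosen.

```python
from typing import List, Sequence


def soft_weighted_vote(
    probas: Sequence[Sequence[Sequence[float]]],
    weights: Sequence[float],
) -> List[int]:
    """Combine class probabilities from several base learners.

    probas[m][i][c] is the probability that learner m assigns to class c
    for sample i. Each learner's probabilities are scaled by its weight,
    averaged, and the class with the highest average is predicted.
    """
    total_weight = sum(weights)
    n_samples = len(probas[0])
    n_classes = len(probas[0][0])
    predictions = []
    for i in range(n_samples):
        averaged = [
            sum(w * p[i][c] for p, w in zip(probas, weights)) / total_weight
            for c in range(n_classes)
        ]
        predictions.append(max(range(n_classes), key=averaged.__getitem__))
    return predictions


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data (assumed): three base learners, four samples, two classes
# (index 0 and index 1; which one means "fake" is an arbitrary choice here).
learner_a = [[0.9, 0.1], [0.4, 0.6], [0.2, 0.8], [0.6, 0.4]]
learner_b = [[0.7, 0.3], [0.6, 0.4], [0.3, 0.7], [0.3, 0.7]]
learner_c = [[0.8, 0.2], [0.3, 0.7], [0.6, 0.4], [0.4, 0.6]]
weights = [0.5, 0.3, 0.2]  # assumed; e.g. proportional to validation scores
y_true = [0, 1, 1, 0]

y_pred = soft_weighted_vote([learner_a, learner_b, learner_c], weights)
print(y_pred, round(macro_f1(y_true, y_pred), 4))
```

Soft voting averages probabilities rather than hard labels, so a confident learner can outvote two uncertain ones; hard voting would instead count one vote per learner.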
Publisher
Association for Computing Machinery (ACM)
Subject
General Computer Science