Abstract
AbstractSpotting fake news is a critical problem nowadays. Social media are responsible for propagating fake news. Fake news propagated over digital platforms generates confusion as well as induce biased perspectives in people. Detection of misinformation over the digital platform is essential to mitigate its adverse impact. Many approaches have been implemented in recent years. Despite the productive work, fake news identification poses many challenges due to the lack of a comprehensive publicly available benchmark dataset. There is no large-scale dataset that consists of Indian news only. So, this paper presents IFND (Indian fake news dataset) dataset. The dataset consists of both text and images. The majority of the content in the dataset is about events from the year 2013 to the year 2021. Dataset content is scrapped using the Parsehub tool. To increase the size of the fake news in the dataset, an intelligent augmentation algorithm is used. An intelligent augmentation algorithm generates meaningful fake news statements. The latent Dirichlet allocation (LDA) technique is employed for topic modelling to assign the categories to news statements. Various machine learning and deep-learning classifiers are implemented on text and image modality to observe the proposed IFND dataset's performance. A multi-modal approach is also proposed, which considers both textual and visual features for fake news detection. The proposed IFND dataset achieved satisfactory results. This study affirms that the accessibility of such a huge dataset can actuate research in this laborious exploration issue and lead to better prediction models.
Funder
Council of Science and Technology, U.P.
Publisher
Springer Science and Business Media LLC
Subject
General Earth and Planetary Sciences,General Environmental Science
Cited by
41 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献