An Improved Multiple Features and Machine Learning-Based Approach for Detecting Clickbait News on Social Networks

Published:2021-10-13 Issue:20 Volume:11 Page:9487
ISSN:2076-3417
Container-title:Applied Sciences
language:en
Short-container-title:Applied Sciences

Author:

Al-Sarem Mohammed, Saeed Faisal, Al-Mekhlafi Zeyad Ghaleb, Mohammed Badiea Abdulkarem, Hadwan Mohammed, Al-Hadhrami Tawfik, Alshammari Mohammad T., Alreshidi Abdulrahman, Alshammari Talal Sarheed

Abstract

The widespread use of social media has increased the popularity of online advertisements, which has been accompanied by a disturbing spread of clickbait headlines. Clickbait dissatisfies users because the article content does not match their expectations. Detecting clickbait posts in online social networks is an important step in addressing this issue. Clickbait posts use phrases designed to attract a user's attention and entice them to click on a specific fake link or website. In other words, clickbait headlines use misleading titles, which may conceal important information about the target website. It is very difficult to recognize these clickbait headlines manually. Therefore, there is a need for an intelligent method to detect clickbait and fake advertisements on social networks. Several machine learning methods have been applied for this purpose. However, the reported accuracy only reached 87% and still needs to be improved. In addition, most of the existing studies were conducted on English headlines and content, and few focused specifically on detecting clickbait headlines in Arabic. Therefore, this study constructed the first Arabic clickbait headline news dataset and presents an improved multiple feature-based approach for detecting clickbait news on social networks in the Arabic language. The proposed approach includes three main phases: data collection, data preparation, and machine learning model training and testing. The collected dataset included 54,893 Arabic news items from Twitter (after pre-processing). Among these news items, 23,981 were clickbait news (43.69%) and 30,912 were legitimate news (56.31%). This dataset was pre-processed, and the most important features were then selected using the ANOVA F-test. Several machine learning (ML) methods were then applied with hyper-parameter tuning to find the optimal settings. Finally, the ML models were evaluated, and the overall performance is reported in this paper. The experimental results show that the Support Vector Machine (SVM) with the top 10% of ANOVA F-test features (user-based features (UFs) and content-based features (CFs)) obtained the best performance, achieving a detection accuracy of 92.16%.
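
The feature-selection step mentioned in the abstract (keeping the top 10% of features ranked by the ANOVA F-test) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `anova_f` and `select_top_fraction`, the toy data, and the choice of plain Python instead of an ML library are all assumptions made for illustration, and the SVM training and hyper-parameter tuning steps are omitted.

```python
from statistics import mean


def anova_f(values, labels):
    """One-way ANOVA F-statistic of a single feature across class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand = mean(values)
    # Variation explained by class membership vs. variation inside classes.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_top_fraction(rows, labels, fraction=0.10):
    """Return sorted indices of the highest-scoring features (at least one)."""
    n_features = len(rows[0])
    scores = [anova_f([r[j] for r in rows], labels) for j in range(n_features)]
    keep = max(1, round(n_features * fraction))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:keep])


# Hypothetical toy data: 6 posts, 10 features, label 1 = clickbait.
labels = [0, 0, 0, 1, 1, 1]
uninformative = [1, 2, 3, 1, 2, 3]            # same mean in both classes
informative = [0.1, 0.2, 0.1, 0.9, 1.0, 0.9]  # separates the classes
rows = [
    [informative[i] if j == 3 else uninformative[i] * (j + 1) for j in range(10)]
    for i in range(6)
]

selected = select_top_fraction(rows, labels, fraction=0.10)
print(selected)
```

On this toy data only the feature at index 3 differs between the two classes, so it is the single feature retained when 10% of ten features are kept. In practice the selected columns would then be passed to a classifier such as an SVM.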

Funder

University of Hail

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science

Link

https://www.mdpi.com/2076-3417/11/20/9487/pdf

References: 24 articles.

1. Clickbait Detection;Potthast,2016

Cited by 9 articles.

1. Clickbait detection in Hebrew;Lodz Papers in Pragmatics;2023-12-01

2. Identification of clickbait news articles using SBERT and correlation matrix;Social Network Analysis and Mining;2023-11-25

3. An Deep Convolutional Neural Networks are used to Detect Cyberbullying on Social Networks.;2023 4th IEEE Global Conference for Advancement in Technology (GCAT);2023-10-06

4. CA-CD: context-aware clickbait detection using new Chinese clickbait dataset with transfer learning method;Data Technologies and Applications;2023-08-29

5. Application of machine learning for improved surface quality classification in ultra-precision machining of germanium;Journal of Manufacturing Systems;2022-10