A qualitative and quantitative comparison between Web scraping and API methods for Twitter credibility analysis

Published:2021-08-03 Issue:6 Volume:17 Page:580-606
ISSN:1744-0084
Container-title:International Journal of Web Information Systems
language:en
Short-container-title:IJWIS

Author:

Dongo Irvin, Cardinale Yudith, Aguilera Ana, Martinez Fabiola, Quintero Yuni, Robayo German, Cabeza David

Abstract

Purpose: This paper aims to perform an exhaustive review of relevant and recent related studies, which reveals that both extraction methods are currently used to analyze credibility on Twitter. Thus, there is clear evidence of the need for different options to extract different data for this purpose. Nevertheless, none of these studies performs a comparative evaluation of both extraction techniques. Moreover, the authors extend a previous comparison, which uses a recently developed framework that offers both alternatives for data extraction and implements a previously proposed credibility model, by adding a qualitative evaluation and a Twitter Application Programming Interface (API) performance analysis from different locations.

Design/methodology/approach: As one of the most popular social platforms, Twitter has been the focus of recent research aimed at analyzing the credibility of the shared information. To do so, several proposals use either the Twitter API or Web scraping to extract the data to perform the analysis. Qualitative and quantitative evaluations are performed to discover the advantages and disadvantages of both extraction methods.

Findings: The study demonstrates the differences in accuracy and efficiency between the two extraction methods and highlights many more problems in this area that must be addressed to pursue true transparency and legitimacy of information on the Web.

Originality/value: Results report that some Twitter attributes cannot be retrieved by Web scraping. Both methods produce identical credibility values when a robust normalization process is applied to the text (i.e. the tweet). Moreover, concerning time performance, Web scraping is faster than the Twitter API and is more flexible in terms of obtaining data; however, Web scraping is very sensitive to website changes. Additionally, the response time of the Twitter API is proportional to the distance from the central server in San Francisco.

Publisher

Emerald

Subject

Computer Networks and Communications, Information Systems

Reference: 46 articles.

1. Olfinder: finding opinion leaders in online social networks;Journal of Information Science,2015

2. An experimental system for measuring the credibility of news content in twitter;International Journal of Web Information Systems,2011

3. Credfinder: a real-time tweets credibility assessing system,2016

4. A credibility analysis system for assessing information on twitter;IEEE Transactions on Dependable and Secure Computing,2016

5. Credibility in online social networks: a survey;IEEE Access,2019

Cited by 6 articles.

1. Physical layer security for IoT over Nakagami-m and mixed Rayleigh–Nakagami-m fading channels;Wireless Networks;2023-06-18

2. It is an online platform and not the real world, I don’t care much: Investigating Twitter Profile Credibility With an Online Machine Learning-Based Tool;Proceedings of the 2023 Conference on Human Information Interaction and Retrieval;2023-03-19

3. CrediBot: Applying Bot Detection for Credibility Analysis on Twitter;IEEE Access;2023

4. Credibility Analysis on Twitter Considering Topic Detection;Applied Sciences;2022-09-09

5. A study on application programming interface recommendation: state-of-the-art techniques, challenges and future directions;Library Hi Tech;2022-08-18