A Flexible Big Data System for Credibility-Based Filtering of Social Media Information According to Expertise

Published:2024-04-15 Issue:1 Volume:17 Page:
ISSN:1875-6883
Container-title:International Journal of Computational Intelligence Systems
language:en
Short-container-title:Int J Comput Intell Syst

Author:

Diaz-Garcia Jose A., Gutiérrez-Batista Karel, Fernandez-Basso Carlos, Ruiz M. Dolores, Martin-Bautista Maria J.

Abstract

Social networks have taken on an irreplaceable role as sources of information. Millions of people use them daily to follow the issues of the moment. As a result of this success, the amount of content on social networks has become unmanageable and, in many cases, fake or non-credible. Correct pre-processing of the data is therefore necessary if we want to obtain knowledge and value from these data sets. In this paper, we propose a new Big Data pre-processing technique that addresses two key concepts of the Big Data paradigm: the validity and credibility of the data, and its volume. The system is a Spark-based filter that flexibly selects credible users related to a given topic under analysis, reducing the volume of data and keeping only the data that are valid for the problem under study. The proposed system combines word embeddings with other text mining and natural language processing techniques. The system has been validated using three real-world use cases.
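The idea of selecting users related to a topic through word embeddings can be illustrated with a minimal sketch. It is not the authors' system: it runs in plain Python rather than Spark, it assumes each user has a short profile description, and the toy vectors in `EMBEDDINGS`, the helper names, and the `threshold` value are all hypothetical. A user is kept when the cosine similarity between the averaged vector of their description and the averaged vector of the topic words reaches the threshold.

```python
import math

# Toy 3-dimensional word vectors (hypothetical values, for illustration only).
# A real system would load vectors from a trained embedding model.
EMBEDDINGS = {
    "doctor": [0.9, 0.1, 0.0],
    "nurse": [0.8, 0.2, 0.1],
    "health": [0.85, 0.15, 0.05],
    "medicine": [0.9, 0.05, 0.1],
    "football": [0.0, 0.9, 0.2],
    "fan": [0.1, 0.8, 0.3],
    "music": [0.1, 0.2, 0.9],
}


def mean_vector(text):
    """Average the vectors of the known words; return None if no word is known."""
    vectors = [EMBEDDINGS[w] for w in text.lower().split() if w in EMBEDDINGS]
    if not vectors:
        return None
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def filter_users(users, topic, threshold=0.9):
    """Keep the ids of users whose description is close enough to the topic."""
    topic_vec = mean_vector(topic)
    if topic_vec is None:
        return []
    kept = []
    for user_id, description in users.items():
        vec = mean_vector(description)
        if vec is not None and cosine(vec, topic_vec) >= threshold:
            kept.append(user_id)
    return kept


# Hypothetical user profiles; u4 has an empty description and is skipped.
users = {
    "u1": "Doctor and nurse educator",
    "u2": "Football fan",
    "u3": "Music lover",
    "u4": "",
}
selected = filter_users(users, "health medicine")
print(selected)
```

Raising or lowering `threshold` trades the amount of data kept against how closely the retained users match the topic, which is one simple way to make such a filter adjustable.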

Funder

Junta de Andalucía

European Union NextGenerationEU / PRTR.

Ministerio de Ciencia e Innovación

Vicerrectorado de Investigación y Transferencia, Universidad de Granada

Ministerio de Educación, Cultura y Deporte

Publisher

Springer Science and Business Media LLC

Link

https://link.springer.com/content/pdf/10.1007/s44196-024-00483-y.pdf
