Abstract
This paper presents a machine learning-based classifier for detecting points of interest through the combined use of images and text from social networks. The model exploits the transfer learning capabilities of the CLIP (Contrastive Language-Image Pre-training) neural network architecture in multimodal environments using image and text. Different methodologies based on multimodal information are explored for the geolocation of the places detected. To this end, pre-trained neural network models are used for the classification of images and their associated texts. The result is a system that creates new synergies between images and texts in order to detect and geolocate trending places that have not been previously tagged by any other means, providing potentially relevant information for tasks such as cataloging specific types of places in a city for the tourism industry. The experiments carried out reveal that, in general, textual information is more accurate and relevant than visual cues in this multimodal setting.
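The CLIP-based approach the abstract describes scores an image against a set of candidate text prompts by cosine similarity of their embeddings. A minimal sketch of that scoring step is shown below; the `classify_poi` function, the place labels, and the toy vectors are illustrative assumptions standing in for real CLIP encoder outputs, which are omitted here.

```python
import numpy as np

def classify_poi(image_emb, text_embs, labels, temperature=100.0):
    """CLIP-style zero-shot scoring: compare one image embedding against
    several text-prompt embeddings via cosine similarity, then softmax."""
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    logits = temperature * (txt @ img)      # scaled cosine similarities
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                    # softmax over candidate labels
    return labels[int(np.argmax(probs))], probs

# Toy embeddings standing in for CLIP encoder outputs (assumed, not real weights).
rng = np.random.default_rng(0)
labels = ["restaurant", "museum", "beach"]
text_embs = rng.normal(size=(3, 8))
# Simulate an image whose embedding lies close to the "museum" prompt.
image_emb = text_embs[1] + 0.05 * rng.normal(size=8)

best, probs = classify_poi(image_emb, text_embs, labels)
```

In a real pipeline the embeddings would come from CLIP's image and text encoders; the scoring and softmax step is the same.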
Funder
Conselleria de Innovación, Universidades, Ciencia y Sociedad Digital, Generalitat Valenciana
European Regional Development Fund
Universidad de Alicante
Publisher
Springer Science and Business Media LLC
Subject
Computer Networks and Communications, Hardware and Architecture, Media Technology, Software
Cited by
5 articles.