Abstract
This paper presents a framework that can automatically analyze the images and comments in user-uploaded location databases. The proposed framework integrates image processing and natural language processing techniques to perform scene classification, data cleaning, and comment summarization so that the cluttered information in user-uploaded databases can be presented in an organized way to users. For scene classification, RGB image features, segmentation features, and the features of discriminative objects are fused with an attention module to improve classification accuracy. For data cleaning, incorrect images are detected using a multilevel feature extractor and a multiresolution distance calculation scheme. Finally, a comment summarization scheme is proposed to overcome the problems of unstructured sentences and the improper usage of punctuation marks, which are commonly found in customer reviews. To validate the proposed framework, a system that can classify and organize scenes and comments for hotels is implemented and evaluated. Comparisons with existing related studies are also performed. The experimental results validate the effectiveness and superiority of the proposed framework.
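As a rough illustration of the attention-based fusion mentioned for scene classification, the sketch below weights three feature vectors with softmax attention scores and sums them. It is not the paper's attention module: the function names, the fixed `query` vector (which would be learned in practice), and the toy three-dimensional features are all assumptions made for this example.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_fuse(streams, query):
    """Fuse equal-length feature vectors with dot-product attention.

    streams: list of feature vectors (here: RGB, segmentation, object).
    query:   hypothetical scoring vector; a real module would learn it.
    """
    # One relevance score per stream: dot product with the query.
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in streams]
    weights = softmax(scores)
    dim = len(query)
    # Weighted sum of the streams, computed per feature dimension.
    fused = [sum(w * feat[i] for w, feat in zip(weights, streams))
             for i in range(dim)]
    return fused, weights

# Toy feature vectors (assumed values, for illustration only).
rgb = [0.9, 0.1, 0.3]
seg = [0.2, 0.8, 0.5]
obj = [0.4, 0.4, 0.9]
query = [1.0, 0.0, 0.5]

fused, weights = attention_fuse([rgb, seg, obj], query)
print("attention weights:", weights)
print("fused feature:", fused)
```

The weights sum to one, so the fused vector is a convex combination of the three streams; the stream that scores highest against the query contributes most to the result.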
Subject
Electrical and Electronic Engineering, Computer Networks and Communications, Hardware and Architecture, Signal Processing, Control and Systems Engineering