Affiliation:
1. University of Electronic Science and Technology of China, China
2. Beijing University of Posts and Telecommunications, China
Abstract
In this work, we propose a novel method to summarize popular information from massive tourism blog data. First, we crawl blog contents and segment each blog into a semantic word vector. Second, we extract the geographical terms from each word vector into a corresponding geographical term vector, and present a new method to explore hot tourism locations and, in particular, their frequent sequential relations from the set of geographical term vectors. Third, we propose a novel word vector subdividing method to collect local features for each hot location, and introduce the max-confidence metric to identify the Things of Interest (ToI) associated with the location from the collected data. We illustrate the benefits of this approach by applying it to a Chinese online tourism blog dataset. The experimental results show that the proposed method can efficiently explore hot locations, their sequential relations, and their corresponding ToI.
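The abstract names max-confidence but does not define it. As a minimal sketch, assuming the standard association-mining definition max(P(A|B), P(B|A)) computed from co-occurrence counts, the score between a hot location and a candidate term might be computed as follows. The function names, the threshold, and the toy data are hypothetical illustrations, not the authors' implementation or dataset.

```python
def max_confidence(vectors, a, b):
    """Max-confidence of terms a and b over a list of term sets.

    Assumes the usual definition: max(sup(ab)/sup(a), sup(ab)/sup(b)),
    where sup(x) counts the vectors containing x.
    """
    sup_a = sum(1 for v in vectors if a in v)
    sup_b = sum(1 for v in vectors if b in v)
    sup_ab = sum(1 for v in vectors if a in v and b in v)
    if sup_ab == 0:
        # No co-occurrence (also covers terms that never appear).
        return 0.0
    return max(sup_ab / sup_a, sup_ab / sup_b)


def rank_toi(vectors, location, min_score=0.5):
    """Rank candidate terms for a location by max-confidence (hypothetical helper)."""
    candidates = set().union(*vectors) - {location}
    scored = [(t, max_confidence(vectors, location, t)) for t in candidates]
    kept = [(t, s) for t, s in scored if s >= min_score]
    # Sort by descending score, then alphabetically for a stable order.
    return sorted(kept, key=lambda ts: (-ts[1], ts[0]))


# Toy word vectors (invented for illustration only).
toy_vectors = [
    {"Chengdu", "panda", "hotpot"},
    {"Chengdu", "panda"},
    {"Beijing", "Great Wall"},
    {"Chengdu", "hotpot"},
    {"Beijing", "hotpot"},
]
```

Because max-confidence takes the larger of the two conditional probabilities, a term that appears almost only alongside one location scores high even when that location appears in many other vectors, which is a plausible reason to use it for location-specific features.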
Subject
Library and Information Sciences, Information Systems
Cited by
20 articles.