Affiliation:
1. Shenzhen Graduate School, Harbin Institute of Technology
Abstract
Twitter is used all over the world, and its users generate a huge number of hot topics in real time. These topics reflect almost every aspect of people’s daily lives, so topic detection in Twitter is useful in many real applications, such as monitoring public opinion, recommending hot products and detecting incidents. However, the performance of traditional topic detection methods is still far from perfect, largely owing to characteristics of tweets such as their limited length and arbitrary abbreviations. To address these problems, we propose a novel framework (MVTD) for Twitter topic detection using multiview clustering, which can integrate multiple relations among tweets, such as semantic relations, social tag relations and temporal relations. We also propose methods for measuring these relations. In particular, to better measure the semantic similarity of tweets, we propose a new document similarity measure based on a suffix tree (STVSM). In addition, we propose a new keyword extraction method based on a suffix tree. Experiments on real datasets show that MVTD performs much better than clustering on any single view and that it is useful for detecting topics from Twitter.
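As a rough illustration of combining several views of tweet similarity, the sketch below scores each pair of tweets on three views (semantic, social tag, temporal), fuses the scores with a weighted average, and groups tweets whose fused similarity passes a threshold. It is not the MVTD algorithm or the STVSM measure: bag-of-words cosine similarity stands in for the suffix-tree measure, weighted fusion stands in for multiview clustering, and every function name, weight, time scale and threshold is an assumption made for illustration.

```python
import math
from collections import Counter


def cosine_sim(text_a, text_b):
    """Semantic view: bag-of-words cosine (a stand-in, not STVSM)."""
    ca, cb = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(ca[t] * cb[t] for t in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0


def jaccard_sim(tags_a, tags_b):
    """Social tag view: Jaccard overlap of hashtag sets."""
    a, b = set(tags_a), set(tags_b)
    return len(a & b) / len(a | b) if a | b else 0.0


def temporal_sim(time_a, time_b, scale=3600.0):
    """Temporal view: exponential decay in the posting-time gap (seconds)."""
    return math.exp(-abs(time_a - time_b) / scale)


def combined_sim(x, y, weights=(0.5, 0.3, 0.2)):
    """Weighted average of the three views (weights are assumed)."""
    views = (
        cosine_sim(x["text"], y["text"]),
        jaccard_sim(x["tags"], y["tags"]),
        temporal_sim(x["time"], y["time"]),
    )
    return sum(w * v for w, v in zip(weights, views))


def cluster(tweets, threshold=0.5):
    """Group tweets into connected components of the thresholded graph."""
    parent = list(range(len(tweets)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(tweets)):
        for j in range(i + 1, len(tweets)):
            if combined_sim(tweets[i], tweets[j]) >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(tweets)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

Each tweet here is a dictionary with hypothetical `text`, `tags` and `time` keys. Two tweets that share a hashtag and some wording and are posted minutes apart fall into one group, while an unrelated tweet posted a day later stays on its own.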
Subject
Library and Information Sciences, Information Systems
Cited by
50 articles.