Short Text Clustering Algorithms for Weibo Topic Detection

Published:2014-06 Volume:971-973 Page:1747-1751
ISSN:1662-8985
Container-title:Advanced Materials Research
Short-container-title:AMR

Author:

Zhang Lei, Chen Hai Qiang, Li Wei Jie, Liu Yan Zhao, Wu Run Pu

Abstract

Text clustering is a popular research topic in text mining, and many text clustering methods now exist to serve different application requirements. Weibo data is currently acquired through the APIs provided by the major microblogging platforms. In this paper, we discuss algorithms for extracting popular topics posted by Weibo users by clustering text after large-scale data collection. Because traditional text analysis may not be applicable to the short texts used on Weibo, clustering is carried out by combining multiple posts into long texts based on their features (forwards, comments, followers, etc.). Either frequency-based or density-based short text clustering works in most cases. The former is suited to finding hot topics in large collections of Weibo short texts, and the latter is suited to finding abnormal content. Both methods use semantic information to improve clustering accuracy, and both improve clustering performance through parallelism.

Publisher

Trans Tech Publications, Ltd.

Subject

General Engineering

Link

https://www.scientific.net/AMR.971-973.1747.pdf

References: 1 article.

1. Jure Leskovec, John Shawe-Taylor. Semantic Text Features from Small World Graphs. Subspace, Latent Structure and Feature Selection techniques: Statistical and Optimization perspectives Workshop. (2005).

Cited by 2 articles.

1. The short texts classification based on neural network topic model;Journal of Intelligent & Fuzzy Systems;2022-02-02

2. Review of intelligent microblog short text processing;Web Intelligence;2016-08-04