Affiliation:
1. IBM T. J. Watson Research Center, Yorktown Heights, NY
Abstract
The large volume of text data continuously produced over time in a variety of large-scale applications, such as social networks, results in massive streams of data. Typically, massive text streams are created by very large-scale interactions of individuals, or by the structured creation of particular kinds of content by dedicated organizations. An example in the latter category would be the massive text streams created by news-wire services. Such text streams pose unprecedented challenges to data mining algorithms from an efficiency perspective. In this paper, we review text stream mining algorithms for a wide variety of problems in data mining such as clustering, classification and topic modeling. A recent challenge arises in the context of social streams, which are generated by large social networks such as Twitter. We also discuss a number of future challenges in this area of research.
Publisher
Association for Computing Machinery (ACM)
References
64 articles.
1. Event Detection in Social Streams
2. C. C. Aggarwal and K. Subbian. Evolutionary Network Analysis: A Survey. ACM Computing Surveys, accepted, to appear, 2014. 10.1145/2601412
3. On demand classification of data streams
4. On clustering massive text and categorical data streams
Cited by
16 articles.
1. A Novel Neural Ensemble Architecture for On-The-Fly Classification of Evolving Text Streams;ACM Transactions on Knowledge Discovery from Data;2023-12-28
2. A Two-Phase Framework for Detecting Manipulation Campaigns in Social Media;Social Computing and Social Media. Design, Ethics, User Behavior, and Social Network Analysis;2020
3. Text Deduplication with Minimum Loss Ratio;Proceedings of the 2019 11th International Conference on Machine Learning and Computing - ICMLC '19;2019
4. Graph-Based Clustering Approach for Economic and Financial Event Detection Using News Analytics Data;Lecture Notes in Computer Science;2018
5. Social Stream Classification with Emerging New Labels;Advances in Knowledge Discovery and Data Mining;2018