Abstract
Topic modeling aims to discover latent themes in collections of text documents. It has applications across fields such as sociology, opinion analysis, and media studies, where topics must be easily interpretable, diverse, and coherent. An efficient topic modeling technique should accurately identify both flat and hierarchical topics; the latter are especially useful in disciplines where topics can be logically arranged into a tree. In this paper, we propose Community Topic, a novel algorithm that exploits word co-occurrence networks to mine communities and produce topics. We evaluate the proposed approach using several metrics and compare it with the usual baselines, confirming its good performance. Community Topic enables quick identification of flat topics and of a topic hierarchy, facilitating the on-demand exploration of sub- and super-topics. It also obtains good results on datasets in different languages.
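The general idea of mining topics from a word co-occurrence network can be illustrated with a minimal sketch. It is not the Community Topic algorithm itself: the abstract does not specify the community mining step, so connected components of a thresholded network are used here as a simple stand-in. The function names, the `min_count` threshold, and the toy documents are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(docs, min_count=2):
    """Count how often two distinct words share a document; keep frequent pairs.

    `min_count` is a hypothetical threshold, not a parameter from the paper.
    """
    counts = Counter()
    for doc in docs:
        words = sorted(set(doc.lower().split()))
        counts.update(combinations(words, 2))
    return {pair: n for pair, n in counts.items() if n >= min_count}


def communities(edges):
    """Group words into connected components of the thresholded network.

    Stand-in for a real community mining method (assumption).
    """
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen, groups = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            word = stack.pop()
            if word in group:
                continue
            group.add(word)
            stack.extend(neighbours[word] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Toy corpus (made up for illustration): two sports and two politics documents.
docs = [
    "goal match team",
    "team goal score",
    "vote election party",
    "party vote poll",
]
topics = communities(cooccurrence_edges(docs))
print(topics)
```

Each returned group of words plays the role of a flat topic. A hierarchy could plausibly be obtained by repeating the grouping inside each community, but that step is left out because the abstract gives no details on it.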
Funder
Natural Sciences and Engineering Research Council of Canada
Canadian Institute for Advanced Research
Alberta Machine Intelligence Institute
Publisher
Springer Science and Business Media LLC
Cited by
2 articles.