Abstract
This paper presents Semantic SentenceRank (SSR), an unsupervised scheme for automatically ranking the sentences of a single document by their relative importance. SSR extracts essential words and phrases from a text document and uses semantic measures to construct two graphs: a semantic phrase graph over phrases and words, and a semantic sentence graph over sentences. It applies two variants of article-structure-biased PageRank to score phrases and words on the first graph and sentences on the second, and then combines these scores into a final score for each sentence. Finally, SSR solves a multi-objective optimization problem that ranks sentences by their final scores and by topic diversity, the latter obtained through semantic subtopic clustering. An implementation of SSR that runs in quadratic time is presented; on the SummBank benchmarks, it outperforms each individual judge’s ranking and compares favorably with the combined ranking of all judges.
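To make the idea of a structure-biased PageRank concrete, the following is a minimal sketch of a generic biased (personalized) PageRank over a weighted graph. It is not the SSR implementation: the abstract does not specify how the two variants define their bias, so the function name `biased_pagerank`, the toy similarity matrix, the position-based bias vector, and the damping factor of 0.85 are all assumptions made for illustration.

```python
def biased_pagerank(weights, bias, damping=0.85, iterations=100, tol=1e-9):
    """Generic biased PageRank (illustrative sketch, not the paper's variant).

    weights[j][i] is the non-negative edge weight from node j to node i,
    e.g. a semantic similarity between two sentences (assumed input).
    bias[i] is a non-negative preference for node i, e.g. a value derived
    from the sentence's position in the article (assumed input).
    """
    n = len(weights)
    total_bias = sum(bias)
    # Normalize the bias into a teleport distribution that sums to 1.
    teleport = [b / total_bias for b in bias]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n

    for _ in range(iterations):
        # Score held by nodes with no outgoing weight is redistributed
        # according to the teleport distribution.
        dangling = sum(scores[j] for j in range(n) if out_sum[j] == 0)
        new_scores = []
        for i in range(n):
            incoming = sum(
                scores[j] * weights[j][i] / out_sum[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_scores.append(
                (1 - damping) * teleport[i]
                + damping * (incoming + dangling * teleport[i])
            )
        delta = sum(abs(a - b) for a, b in zip(new_scores, scores))
        scores = new_scores
        if delta < tol:
            break
    return scores


# Toy example: four sentences with equal pairwise similarity, and a
# hypothetical bias that favors sentences appearing earlier in the article.
similarity = [
    [0, 1, 1, 1],
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [1, 1, 1, 0],
]
position_bias = [4, 3, 2, 1]
sentence_scores = biased_pagerank(similarity, position_bias)
print(sentence_scores)
```

Because the toy graph is fully symmetric, any difference between the resulting scores comes only from the bias vector, which isolates the effect of the biasing step. With uniform bias the procedure reduces to ordinary PageRank on the weighted graph.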
Subject
Artificial Intelligence, Computational Theory and Mathematics, Computer Science Applications, Theoretical Computer Science, Software