Abstract
This article describes a new method for generating extractive summaries directly via unigram and bigram extraction techniques. The method uses selective part-of-speech tagging to extract significant unigrams and bigrams from a set of sentences. The extracted unigrams and bigrams, along with other features, are used to build the final summary. A new selective rule-based part-of-speech tagging system is developed that concentrates on the parts of speech most important for summarization: nouns, verbs, and adjectives. Other parts of speech, such as prepositions, articles, and adverbs, play a lesser role in determining the meaning of sentences and are therefore not considered when choosing significant unigrams and bigrams. The proposed method is tested on two problem domains: the citations and Opinosis data sets. Results show that the proposed method performs better than the TextRank, LexRank, and Edmundson summarization methods. The proposed method is general enough to summarize texts from any domain.
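To make the general idea concrete, the following is a minimal sketch of n-gram-based extractive scoring. It is not the authors' system. The abstract does not specify the tagging rules, the "other features", or the scoring function, so everything here is an assumption for illustration. The names `FUNCTION_WORDS`, `content_words`, `ngrams`, and `summarize` are hypothetical, and a small stop list stands in for the selective tagger by discarding words that are clearly not nouns, verbs, or adjectives.

```python
import re
from collections import Counter

# ASSUMPTION: a tiny closed-class word list stands in for the selective
# rule-based tagger. Auxiliary verbs are treated as function words here,
# which is a simplification and not something the abstract states.
FUNCTION_WORDS = {
    "a", "an", "the", "of", "in", "on", "at", "to", "for", "with", "by",
    "and", "or", "but", "is", "are", "was", "were", "be", "it", "this",
    "that", "very", "not", "than", "from", "as",
}


def content_words(sentence):
    """Lowercase the sentence and keep tokens that are not function words."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return [t for t in tokens if t not in FUNCTION_WORDS]


def ngrams(words):
    """Return (unigrams, bigrams); bigrams pair adjacent *kept* words."""
    return list(words), list(zip(words, words[1:]))


def summarize(sentences, k=1):
    """Pick the k sentences whose unigrams and bigrams are most frequent.

    ASSUMPTION: the score is the summed corpus frequency of a sentence's
    distinct unigrams and bigrams, divided by its content-word count.
    """
    per_sentence = [ngrams(content_words(s)) for s in sentences]
    uni_counts = Counter(u for unis, _ in per_sentence for u in unis)
    bi_counts = Counter(b for _, bis in per_sentence for b in bis)

    def score(item):
        unis, bis = item
        if not unis:
            return 0.0
        total = sum(uni_counts[u] for u in set(unis))
        total += sum(bi_counts[b] for b in set(bis))
        return total / len(unis)

    ranked = sorted(
        range(len(sentences)),
        key=lambda i: score(per_sentence[i]),
        reverse=True,
    )
    # Emit the selected sentences in their original order.
    return [sentences[i] for i in sorted(ranked[:k])]


if __name__ == "__main__":
    reviews = [
        "The battery life is great.",
        "The screen is bright.",
        "Battery life of this phone is great and the battery charges fast.",
    ]
    print(summarize(reviews, k=2))
```

A real implementation would replace the stop list with an actual part-of-speech tagger and would likely combine the n-gram score with further sentence features, as the abstract indicates.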