Affiliation:
1. University College of University, India
2. Stanley College of Engineering and Technology for Women, India
Abstract
With the rapid growth of digital content, there is a need for an automatic text summarizer to provide short text from a long text document. Many research works have been presented for extractive text summarization (ETS). This article mainly focuses on the graph-based ETS approach for multiple Telugu text documents. A modified Text-Rank algorithm is employed with the noun and verb count of each sentence in the text as the initial score of each node. To get the optimal features, a novel feature selection algorithm called improved Flamingo Search Algorithm is proposed in this article. Though graph-based ETS is an important approach, the generated summaries are redundant. To reduce the redundancy in the generated summary, maximum marginal relevance is combined with the modified Text-Rank. Different word-embedding techniques such as Fast-Text, Word2vec, TF-IDF, and one-hot encoding are utilized to experiment with the proposed approach. The performance of the proposed text summarization approach is evaluated with BLEU and ROUGE in terms of F-measure, precision, and recall.
Publisher
Association for Computing Machinery (ACM)