Document Summarization Using Sentence-Level Semantic Based on Word Embeddings

Published: 2019-02 Issue: 02 Volume: 29 Page: 177-196
ISSN:0218-1940
Container-title:International Journal of Software Engineering and Knowledge Engineering
language:en
Short-container-title:Int. J. Soft. Eng. Knowl. Eng.

Author:

Al-Sabahi Kamal¹, Zuping Zhang¹

Affiliation:

1. School of Information Science and Engineering, Central South University, Changsha, Hunan 410083, P. R. China

Abstract

In the era of information overload, text summarization has become a focus of attention in diverse fields such as question answering systems, intelligence analysis, news recommendation systems, and search results in web search engines. A good document representation is key to any successful summarizer, and learning this representation has become a very active research area in natural language processing (NLP). Traditional approaches mostly fail to deliver a good representation, whereas word embeddings have shown excellent performance in learning it. In this paper, a modified BM25 is combined with word embeddings to build sentence vectors from word vectors. The entire document is represented as a set of sentence vectors, and the similarity between every pair of sentence vectors is computed. TextRank, a graph-based model, is then used to rank the sentences. The summary is generated by picking the top-ranked sentences according to the compression rate. Two well-known datasets, DUC2002 and DUC2004, are used to evaluate the models. The experimental results show that the proposed models perform better overall than the state-of-the-art methods.
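The pipeline described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not specify the modification to BM25, so the standard Okapi BM25 term weight (with a non-negative IDF variant) is swapped in; the function names, the toy 2-dimensional embeddings, the clipping of negative cosine similarities to zero, and the reading of the compression rate as the fraction of sentences kept are all assumptions made for the example.

```python
import math
from collections import Counter

def bm25_weights(sentences, k1=1.5, b=0.75):
    """Standard Okapi BM25 weight per term, treating each sentence as a document.
    Assumption: the paper's *modified* BM25 is not given here, so the plain form is used."""
    n = len(sentences)
    avg_len = sum(len(s) for s in sentences) / n
    df = Counter(w for s in sentences for w in set(s))
    weights = []
    for s in sentences:
        w = {}
        for term, f in Counter(s).items():
            # Non-negative IDF variant: log(1 + (N - df + 0.5) / (df + 0.5))
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            w[term] = idf * f * (k1 + 1) / (f + k1 * (1 - b + b * len(s) / avg_len))
        weights.append(w)
    return weights

def sentence_vectors(sentences, embeddings, dim):
    """BM25-weighted average of word vectors; words without a vector are skipped."""
    vecs = []
    for w in bm25_weights(sentences):
        v, total = [0.0] * dim, 0.0
        for term, weight in w.items():
            if term in embeddings:
                for i, x in enumerate(embeddings[term]):
                    v[i] += weight * x
                total += weight
        vecs.append([x / total for x in v] if total else v)
    return vecs

def cosine(u, v):
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    if nu == 0 or nv == 0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)

def textrank(sim, d=0.85, iters=100, tol=1e-6):
    """Weighted PageRank over a similarity matrix with a zero diagonal."""
    n = len(sim)
    scores = [1.0 / n] * n
    out = [sum(row) for row in sim]
    for _ in range(iters):
        new = [(1 - d) / n + d * sum(sim[j][i] / out[j] * scores[j]
                                     for j in range(n) if out[j] > 0)
               for i in range(n)]
        converged = max(abs(a - b) for a, b in zip(new, scores)) < tol
        scores = new
        if converged:
            break
    return scores

def summarize(sentences, embeddings, dim, compression_rate=0.5):
    """Return the top-ranked sentences, kept in their original order."""
    n = len(sentences)
    vecs = sentence_vectors(sentences, embeddings, dim)
    # Assumption: negative similarities are clipped to 0 so edge weights stay non-negative.
    sim = [[max(cosine(vecs[i], vecs[j]), 0.0) if i != j else 0.0
            for j in range(n)] for i in range(n)]
    scores = textrank(sim)
    k = max(1, int(n * compression_rate))
    top = sorted(sorted(range(n), key=lambda i: -scores[i])[:k])
    return [sentences[i] for i in top]

# Hypothetical toy data: pre-tokenized sentences and made-up 2-D word vectors.
doc = [["cats", "chase", "mice"], ["dogs", "chase", "cats"],
       ["stocks", "fell", "today"], ["mice", "fear", "cats"]]
toy_embeddings = {"cats": [0.9, 0.1], "dogs": [0.8, 0.2], "mice": [0.85, 0.15],
                  "chase": [0.7, 0.3], "fear": [0.6, 0.4],
                  "stocks": [0.1, 0.9], "fell": [0.2, 0.8], "today": [0.3, 0.7]}
summary = summarize(doc, toy_embeddings, dim=2, compression_rate=0.5)
```

In practice the toy dictionary would be replaced by pretrained word vectors, and the sentences would come from a tokenizer rather than hand-built lists.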

Publisher

World Scientific Pub Co Pte Lt

Subject

Artificial Intelligence, Computer Graphics and Computer-Aided Design, Computer Networks and Communications, Software

Link

https://www.worldscientific.com/doi/pdf/10.1142/S0218194019500086

Cited by 6 articles.

1. Automatic Text Summarization Method Based on Improved TextRank Algorithm and K-Means Clustering;Knowledge-Based Systems;2024-03

2. Impact of Similarity Measures in Graph-based Automatic Text Summarization of Konkani Texts;ACM Transactions on Asian and Low-Resource Language Information Processing;2023-02-21

3. Malay lexical simplification model for non-native speaker;2022 International Conference on Intelligent Systems and Computer Vision (ISCV);2022-05-18

4. Exploiting Semantic Term Relations in Text Summarization;International Journal of Information Retrieval Research;2022-01

5. Similar case matching with explicit knowledge-enhanced text representation;Applied Soft Computing;2020-10