Affiliation:
1. Suzhou University of Science and Technology
2. Soochow University, Suzhou, Jiangsu, China
Abstract
Text summarization, which automatically condenses a text into a summary, is an important task in natural language processing. Summarization systems for short and long English text and for short Chinese text have benefited from advances in neural encoder-decoder models, thanks to the availability of large datasets. Research on long Chinese text summarization, however, has been limited to datasets of only a couple of hundred instances. This article explores the long Chinese text summarization task. We first construct the first large-scale long Chinese text summarization corpus, the Long Chinese Summarization of Police Inquiry Record Text (LCSPIRT). Based on this corpus, we propose a sequence-to-sequence (Seq2Seq) model that incorporates a global encoding process with an attention mechanism. Our model achieves competitive results on the LCSPIRT corpus compared with several benchmark methods.
Funder
National Natural Science Foundation of China
Science & Technology Development Project of Suzhou
Innovative Team of Jiangsu Province
Publisher
Association for Computing Machinery (ACM)
Cited by
12 articles.