Affiliation:
1. Hung Yen University of Technology and Education (UTEHY), Vietnam
2. Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Nomi, Ishikawa, 923-1292, Japan
Abstract
User-generated content such as comments or tweets (also called social information) following a Web document provides additional information for enriching the content of an event mentioned in sentences. This paper presents a framework named SoSVMRank, which integrates the user-generated content of a Web document to generate a high-quality summarization. To do that, summarization is formulated as a learning-to-rank task, in which comments or tweets are exploited to support sentences in a mutual reinforcement fashion. To model the sentence-comment (or sentence-tweet) relation, a set of local and social features is proposed. After ranking, the top-m ranked sentences and comments (or tweets) are selected as the summarization. To validate the efficiency of our framework, sentence and story highlight extraction tasks were taken as a case study on three datasets in two languages, English and Vietnamese. Experimental results indicate that: (i) our new features improve the summarization performance of the framework in terms of ROUGE scores compared to state-of-the-art baselines and (ii) the integration of user-generated content benefits single-document summarization.
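The rank-then-select idea described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the paper's method: the function names, the single position-based local feature, the single cosine-similarity social feature, and the fixed weights, which stand in for the ranking model that SoSVMRank learns from data.

```python
import math
from collections import Counter


def bow(text):
    """Lowercased bag-of-words vector (hypothetical tokenizer: whitespace split)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def social_support(item, others):
    """Illustrative social feature: best match between an item and the other side."""
    vec = bow(item)
    return max((cosine(vec, bow(o)) for o in others), default=0.0)


def rank_top_m(items, others, m, w_local=0.3, w_social=0.7):
    """Score each item with fixed (not learned) weights and keep the top m.

    Pass sentences as `items` and comments as `others` to rank sentences,
    or swap the arguments to rank comments, so each side supports the other.
    """
    scored = []
    for i, item in enumerate(items):
        local = 1.0 / (i + 1)  # illustrative local feature: earlier items score higher
        score = w_local * local + w_social * social_support(item, others)
        scored.append((score, item))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in scored[:m]]


sentences = [
    "The storm hit the coast on Monday",
    "Officials opened shelters for residents",
    "Weather is a common topic",
]
comments = [
    "Shelters for residents opened quickly",
    "hope residents are safe in shelters",
]
top_sentences = rank_top_m(sentences, comments, m=1)
top_comments = rank_top_m(comments, sentences, m=1)
```

In this toy input the second sentence overlaps heavily with the first comment, so the social feature lifts it above the opening sentence even though its position score is lower. A learned ranker would replace the fixed weights with parameters fitted on annotated summaries.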
Publisher
World Scientific Pub Co Pte Lt
Subject
Artificial Intelligence
Cited by
3 articles.