Affiliation:
1. Politecnico di Torino, Italy
2. Institute for Cancer Research at Candiolo (IRCC), Italy
Abstract
With the diffusion of online newspapers and social media, users are becoming capable of retrieving dozens of news articles covering the same topic in a short time. News article summarization is the task of automatically selecting a worthwhile subset of news' sentences that users could easily explore. Promising research directions in this field are the use of semantics-based models (e.g., ontologies and taxonomies) to identify key document topics and the integration of social data analysis to also consider the current user's interests during summary generation. The chapter overviews the most recent research advances in document summarization and presents a novel strategy to combine ontology-based and social knowledge for addressing the problem of generic (not query-based) multi-document summarization of news articles. To identify the most salient news articles' sentences, an ontology-based text analysis is performed during the summarization process. Furthermore, the social content acquired from real Twitter messages is separately analyzed to also consider the current interests of social network users for sentence evaluation. The combination of ontological and social knowledge allows the generation of accurate and easy-to-read news summaries. Moreover, the proposed summarizer performs better than the evaluated competitors on real news articles and Twitter messages.