Features in extractive supervised single-document summarization: case of Persian news

Published: 2024-05-08
ISSN: 1574-020X
Container-title: Language Resources and Evaluation
language: en
Short-container-title: Lang Resources & Evaluation

Author:

Rezaei Hosein, Mirhosseini Seyed Amid Moeinzadeh, Shahgholian Azar, Saraee Mohamad

Abstract

Text summarization has been one of the most challenging areas of research in NLP. Much effort has been made to overcome this challenge by using either abstractive or extractive methods. Extractive methods are preferable due to their simplicity compared with the more elaborate abstractive methods. In extractive supervised single-document approaches, the system does not generate sentences. Instead, via supervised learning, it learns how to score sentences within the document based on some textual features and subsequently selects those with the highest rank. Therefore, the core objective is ranking, which depends heavily on the document structure and context. These dependencies have gone unnoticed by many state-of-the-art solutions. In this work, document-related features such as topic and relative length are integrated into the vectors of every sentence to enhance the quality of summaries. Our experimental results show that the system takes contextual and structural patterns into account, which increases the precision of the learned model. Consequently, our method produces more comprehensive and concise summaries.
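The score-and-select pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the feature set (relative position, relative length, sentence count, topic label), the function names, and the hand-set weights in `placeholder_score` are all assumptions made for illustration. In a supervised system the scoring function would be a model trained on labeled summaries.

```python
from typing import Callable, List, Sequence


def sentence_vectors(sentences: Sequence[str], topic_id: int) -> List[List[float]]:
    """Build one feature vector per sentence and append document-level features."""
    lengths = [len(s.split()) for s in sentences]
    n = len(sentences)
    doc_len = sum(lengths)
    vectors = []
    for i, length in enumerate(lengths):
        vectors.append([
            i / max(n - 1, 1),         # relative position (0.0 = first sentence)
            length / max(doc_len, 1),  # sentence length relative to document length
            float(n),                  # document-level: number of sentences
            float(topic_id),           # document-level: topic label (assumed to be given)
        ])
    return vectors


def summarize(
    sentences: Sequence[str],
    topic_id: int,
    score: Callable[[List[float]], float],
    k: int = 2,
) -> List[str]:
    """Score every sentence, keep the k highest ranked, return them in document order."""
    vectors = sentence_vectors(sentences, topic_id)
    scores = [score(v) for v in vectors]
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen = sorted(ranked[:k])  # restore original document order
    return [sentences[i] for i in chosen]


def placeholder_score(v: List[float]) -> float:
    # Hypothetical hand-set weights standing in for a trained model:
    # favor sentences that are early and relatively long.
    return -1.0 * v[0] + 2.0 * v[1]


doc = ["a b c d", "e", "f g h", "i j"]
summary = summarize(doc, topic_id=0, score=placeholder_score, k=2)
print(summary)
```

Because the document-level features are identical for every sentence in one document, they only change the ranking when the scoring model can combine them with the sentence-level features, for example a tree ensemble or a neural regressor trained across many documents.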

Publisher

Springer Science and Business Media LLC

Link

https://link.springer.com/content/pdf/10.1007/s10579-024-09739-7.pdf

References: 54 articles.

1. Asgarian, E. (2021). Text-Mining.ir [Online]. Available: https://demo.text-mining.ir/Home/Summarization. Accessed: 03-May-2021.

2. Barrera, A., & Verma, R. (2012). Combining syntax and semantics for automatic extractive single-document summarization. In CICLing'12 Proceedings of the 13th International Conference on Computational Linguistics and Intelligent Text Processing—Volume Part II, pp. 366–377.

3. Berenjkoub, M., & Palhang, M. (2012). Persian text summarization using a supervised machine learning approach. In Proceedings of the Robocup IranOpen 2012 Symposium and 2nd Iran's Joint Conference of Robotics and AI, Tehran, Iran.

4. Christensen, J., Soderland, S., & Etzioni, O. (2013). Towards coherent multi-document summarization. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1163–1173.

5. Dlikman, A. (2016). Using machine learning methods and linguistic features in single-document extractive summarization. DMNLP@PKDD/ECML, pp. 1–8.