Y-Rank: A Multi-Feature-Based Keyphrase Extraction Method for Short Text

Published: 2024-03-16 Issue: 6 Volume: 14 Page: 2510
ISSN: 2076-3417
Container-title: Applied Sciences
Language: en
Short-container-title: Applied Sciences

Author:

Liu Qiang¹, Hui Yan¹, Liu Shangdong¹, Ji Yimu¹

Affiliation:

1. School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China

Abstract

Keyphrase extraction is a critical task in text information retrieval and is traditionally approached with either supervised or unsupervised methods. Supervised methods generally rely on large corpora, which raises availability problems, while unsupervised methods do not depend on external resources but suffer from defects such as imperfect statistical features or low accuracy. In short-text scenarios in particular, limited text features often result in low-quality candidate ranking. To address this issue, this paper proposes Y-Rank, a lightweight unsupervised keyphrase extraction method that extracts the average information content of candidate sentences from a single document as the key statistical feature, and follows a similarity-based graph construction approach to obtain high-quality semantic features of keyphrases and improve ranking accuracy. Finally, the top-ranked keyphrases are obtained by fusing these features. Experimental results on five datasets show that Y-Rank outperforms nine other unsupervised methods, improves on six accuracy metrics (Precision, Recall, F-Measure, MRR, MAP, and Bpref), and achieves its largest improvement in short-text scenarios.
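
For readers unfamiliar with the ranking metrics named in the abstract, the following is a minimal sketch of how four of them are commonly computed for a single document. It is not taken from the paper: the function name `evaluate_ranking`, the cutoff `k`, and the sample phrases are illustrative assumptions, exact string matching is assumed (no stemming), and the ranked list is assumed to contain no duplicates. MRR and MAP are the means of the per-document reciprocal rank and average precision over a dataset. Bpref is omitted here.

```python
from typing import Dict, List, Set


def evaluate_ranking(ranked: List[str], gold: Set[str], k: int) -> Dict[str, float]:
    """Score one document's ranked keyphrases against its gold keyphrases.

    Hypothetical helper for illustration; assumes exact string matching
    and a ranked list without duplicates.
    """
    top_k = ranked[:k]
    hits = sum(1 for phrase in top_k if phrase in gold)

    # Precision, recall, and F-measure over the top-k predictions.
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0

    # Reciprocal rank: 1 / position of the first correct phrase (0 if none).
    reciprocal_rank = 0.0
    for rank, phrase in enumerate(ranked, start=1):
        if phrase in gold:
            reciprocal_rank = 1.0 / rank
            break

    # Average precision: mean of precision values at each correct position,
    # normalised by the number of gold keyphrases.
    found = 0
    precision_sum = 0.0
    for rank, phrase in enumerate(ranked, start=1):
        if phrase in gold:
            found += 1
            precision_sum += found / rank
    average_precision = precision_sum / len(gold) if gold else 0.0

    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "reciprocal_rank": reciprocal_rank,
        "average_precision": average_precision,
    }


# Made-up sample data, for illustration only.
ranked = ["graph ranking", "short text", "keyphrase extraction", "dataset"]
gold = {"keyphrase extraction", "short text"}
scores = evaluate_ranking(ranked, gold, k=3)
```

Averaging `reciprocal_rank` and `average_precision` over all documents in a dataset gives MRR and MAP respectively.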

Funder

National Key R&D Program of China

Jiangsu Key Development Planning Project

Natural Science Foundation of Jiangsu Province

The 14th Five-Year Plan project of Equipment Development Department

Jiangsu Hongxin Information Technology Co., Ltd. Project

Future Network Scientific Research Fund Project

NUPTSF

Publisher

MDPI AG

Link

https://www.mdpi.com/2076-3417/14/6/2510/pdf

References: 53 articles.

1. Sun, C., Hu, L., Li, S., Li, T., Li, H., and Chi, L. (2020). A review of unsupervised keyphrase extraction methods using within-collection resources. Symmetry, 12.

2. Lv, S., Guo, D., Xu, J., Tang, D., Duan, N., Gong, M., Shou, L., Jiang, D., Cao, G., and Hu, S. (2020, January 7–12). Graph-based reasoning over heterogeneous external knowledge for commonsense question answering. Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA.

3. Yang, H., Sanner, S., Wu, G., and Zhou, J.P. (2021, January 21–25). Bayesian Preference Elicitation with Keyphrase-Item Coembeddings for Interactive Recommendation. Proceedings of the 29th ACM Conference on User Modeling, Utrecht, The Netherlands.

4. Ezugwu et al. (2021). Automatic clustering algorithms: A systematic review and bibliometric analysis of relevant literature. Neural Comput. Appl.

5. Zhou, C., Shang, J., Zhang, J., Li, Q., and Hu, D. (2021, January 7–10). Topic-Attentive Encoder-Decoder with Pre-Trained Language Model for Keyphrase Generation. Proceedings of the 2021 IEEE International Conference on Data Mining (ICDM), Auckland, New Zealand.