Research of news text classification method based on hierarchical semantics and prior correction

Published:2024-04-18 Issue:4 Volume:46 Page:8185-8203
ISSN:1064-1246
Container-title:Journal of Intelligent & Fuzzy Systems
Short-container-title:IFS

Author:

Sun Ping¹, Song LinLin², Yuan Ling², Yu Haiping¹, Wei Yinzhen¹

Affiliation:

1. Wuhan Vocational College of Software and Engineering, Wuhan, Hubei, China

2. School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China

Abstract

News text analysis is an important branch of natural language processing. Compared with ordinary text, news text has significant economic and scientific value. It is characterized by a hierarchical structure, diverse label categories, and a limited number of high-quality annotated samples. Many machine learning and deep learning methods exist for analyzing various forms of news text. However, current methods are limited by label imbalance, hierarchical semantics, and easily confused labels. This paper therefore proposes a news text classification framework based on hierarchical semantics and prior correction (HSPC). First, data augmentation is used to increase the diversity of the training set, and adversarial learning is employed to improve the robustness of the model. Then, a hierarchical feature extraction approach extracts semantic features from different levels of the news text. Next, a feature fusion method is designed so that the model can focus on the hierarchical semantics relevant to label classification. Finally, highly confusing label predictions are corrected to optimize the model's label predictions and improve their confidence. Multiple experiments are performed on four widely used public datasets. The results indicate that HSPC achieves higher classification accuracy than the other models. On the FCT, AGNews, THUCNews, and Ohsumed datasets, HSPC improves accuracy by 1.03%, 1.38%, 2.55%, and 1.15%, respectively, compared with state-of-the-art methods. These results support the rationality and effectiveness of the designed mechanisms.
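
The abstract does not give the formulas behind the fusion and correction steps, so the following is only a minimal sketch of the general idea under stated assumptions, not the HSPC implementation. It assumes that fusion can be approximated by attention-weighted pooling over per-level feature vectors, and that correction can be approximated by dividing predicted probabilities by class priors and renormalizing. The function names `fuse_levels` and `correct_with_prior`, the `query` vector, and all numbers are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_levels(level_features, query):
    """Attention-weighted sum of per-level feature vectors.

    Assumption: each level (e.g. word, sentence, document) yields one
    vector of the same size, scored by a dot product with `query`.
    """
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in level_features]
    weights = softmax(scores)
    dim = len(level_features[0])
    fused = [
        sum(w * feat[i] for w, feat in zip(weights, level_features))
        for i in range(dim)
    ]
    return fused, weights


def correct_with_prior(probs, priors):
    """Divide predicted probabilities by class priors, then renormalize.

    Assumption: a generic prior adjustment for imbalanced labels, used
    here as a stand-in for the correction step the abstract describes.
    """
    adjusted = [p / pr for p, pr in zip(probs, priors)]
    total = sum(adjusted)
    return [a / total for a in adjusted]


# Hypothetical features for three levels of one news item.
levels = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused, weights = fuse_levels(levels, query=[1.0, 1.0])

# Hypothetical two-class prediction with a heavily imbalanced prior.
corrected = correct_with_prior([0.6, 0.4], priors=[0.9, 0.1])
```

In this toy input the third level has the highest dot product with the query, so it receives the largest attention weight, and the rare class overtakes the frequent one once the prior is divided out.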

Publisher

IOS Press
