A semi-supervised short text sentiment classification method based on improved Bert model from unlabelled data

Published:2023-03-15 Issue:1 Volume:10 Page:
ISSN:2196-1115
Container-title:Journal of Big Data
language:en
Short-container-title:J Big Data

Author:

Zou Haochen,Wang Zitao

Abstract

Short text information has considerable commercial and social value. Natural language processing and short text sentiment analysis can organize and analyze the short text information found on the Internet. Natural language processing tasks such as sentiment classification have achieved satisfactory performance under a supervised learning framework. However, traditional supervised learning relies on large-scale, high-quality manual labels, and such labeled data is expensive to obtain. This strong dependence on labeled data is the bottleneck of supervised learning and largely hinders the application of deep learning models. In addition, short text datasets such as product reviews have an imbalanced distribution of samples across classes. To address these problems, this paper proposes a method that predicts labels for unlabeled data in a semi-supervised learning mode and implements the MixMatchNL data augmentation method. The Bert pre-training model is also updated: its cross-entropy loss function is replaced with the Focal Loss function to alleviate the data imbalance in short text datasets. Experimental results on public datasets indicate that the proposed model improves the accuracy of short text sentiment recognition compared with the model before the update and with other state-of-the-art models.
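The replacement of cross-entropy with Focal Loss mentioned in the abstract can be illustrated with a minimal sketch. It uses the standard focal loss form, FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t), for a single example. The function names and the defaults gamma=2.0 and alpha=1.0 are assumptions chosen for illustration; the abstract does not state the paper's actual settings or implementation.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Standard cross-entropy for one example: -log(p_t)."""
    return -math.log(softmax(logits)[target])


def focal_loss(logits, target, gamma=2.0, alpha=1.0):
    """Focal loss for one example: -alpha * (1 - p_t)**gamma * log(p_t).

    gamma and alpha are illustrative defaults (assumptions), not values
    taken from the paper. With gamma=0 and alpha=1 this reduces to
    ordinary cross-entropy.
    """
    p_t = softmax(logits)[target]
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


# An "easy" example (confidently correct) and a "hard" one (mostly wrong).
easy_logits, hard_logits, target = [4.0, 0.0], [0.0, 1.0], 0

easy_ratio = focal_loss(easy_logits, target) / cross_entropy(easy_logits, target)
hard_ratio = focal_loss(hard_logits, target) / cross_entropy(hard_logits, target)
print(f"easy example keeps {easy_ratio:.4%} of its cross-entropy loss")
print(f"hard example keeps {hard_ratio:.4%} of its cross-entropy loss")
```

The factor (1 - p_t)^gamma shrinks the loss of well-classified examples much more than that of misclassified ones, so abundant easy samples from a majority class contribute less to training. This is the usual motivation for applying focal loss to imbalanced data.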

Publisher

Springer Science and Business Media LLC

Subject

Information Systems and Management,Computer Networks and Communications,Hardware and Architecture,Information Systems

Link

https://link.springer.com/content/pdf/10.1186/s40537-023-00710-x.pdf

References: 51 articles.

1. Boyd D, Golder S, Lotan G. Tweet, tweet, retweet: conversational aspects of retweeting on twitter. In: 2010 43rd Hawaii international conference on system sciences. New York: IEEE; 2010. p. 1–10.

2. Roy G, Debnath R, Mitra PS, Shrivastava AK. Analytical study of low-income consumers’ purchase behaviour for developing marketing strategy. Int J Syst Assurance Eng Manag. 2021;12(5):895–909.

3. Cambria E, Schuller B, Xia Y, Havasi C. New avenues in opinion mining and sentiment analysis. IEEE Intell Syst. 2013;28(2):15–21.

4. Lin H-CK, Wang T-H, Lin G-C, Cheng S-C, Chen H-R, Huang Y-M. Applying sentiment analysis to automatically classify consumer comments concerning marketing 4cs aspects. Appl Soft Comput. 2020;97:106755.

5. Jagtap V, Pawar K. Analysis of different approaches to sentence-level sentiment classification. Int J Sci Eng Technol. 2013;2(3):164–70.

Cited by 7 articles.

1. Sentiment-based predictive models for online purchases in the era of marketing 5.0: a systematic review;Journal of Big Data;2024-08-05

2. A novel deep learning model for detection of inconsistency in e-commerce websites;Neural Computing and Applications;2024-03-16

3. Sentiment Analysis of Hotel Reviews based on BERT and XGBoost;2024 3rd International Conference on Computer Technologies (ICCTech);2024-02-01

4. Enhancing Sentiment Analysis Accuracy in Borobudur Temple Visitor Reviews through Semi-Supervised Learning and SMOTE Upsampling;Journal of Advances in Information Technology;2024

5. Pseudo-Labeling With Large Language Models for Multi-Label Emotion Classification of French Tweets;IEEE Access;2024