Abstract
Text classification is a basic task in natural language processing. When training data are insufficient, classification accuracy suffers considerably. We propose using back-translation to expand three Chinese text classification datasets, and then train and evaluate a deep learning classification model on the expanded datasets. The results show that expanding the data with back-translation is particularly helpful on smaller datasets; it can also reduce the imbalance in the sample distribution and improve classification performance.
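As an illustration of the idea only, and not the authors' implementation, back-translation augmentation can be sketched as follows: each sample from an under-represented class is translated into a pivot language and back, and the resulting paraphrase is added with the same label. The function names, the sample sentences, and the dictionary-based `toy_zh_to_en` / `toy_en_to_zh` translators are all hypothetical stand-ins; a real setup would plug in an actual machine translation system in their place.

```python
from collections import Counter
from typing import Callable, List, Tuple

Sample = Tuple[str, str]          # (text, label)
Translate = Callable[[str], str]  # any text-to-text translation function


def back_translate(text: str, to_pivot: Translate, from_pivot: Translate) -> str:
    """Translate text into a pivot language and back to obtain a paraphrase."""
    return from_pivot(to_pivot(text))


def augment_minority_classes(
    samples: List[Sample], to_pivot: Translate, from_pivot: Translate
) -> List[Sample]:
    """Add one back-translated paraphrase per sample of each minority class.

    A class stops receiving paraphrases once it reaches the size of the
    largest class. Paraphrases identical to an existing text are skipped.
    """
    counts = Counter(label for _, label in samples)
    target = max(counts.values())
    seen = {text for text, _ in samples}
    augmented = list(samples)
    for text, label in samples:
        if counts[label] >= target:
            continue  # class is already as large as the largest class
        paraphrase = back_translate(text, to_pivot, from_pivot)
        if paraphrase not in seen:
            seen.add(paraphrase)
            augmented.append((paraphrase, label))
            counts[label] += 1
    return augmented


# Hypothetical stand-in translators: fixed lookups, NOT a real MT system.
_ZH_EN = {"这部电影很好看": "this movie is great"}
_EN_ZH = {"this movie is great": "这部电影很棒"}


def toy_zh_to_en(text: str) -> str:
    return _ZH_EN.get(text, text)


def toy_en_to_zh(text: str) -> str:
    return _EN_ZH.get(text, text)


if __name__ == "__main__":
    data = [
        ("服务太差了", "neg"),
        ("等了很久", "neg"),
        ("质量很糟糕", "neg"),
        ("这部电影很好看", "pos"),
    ]
    for text, label in augment_minority_classes(data, toy_zh_to_en, toy_en_to_zh):
        print(label, text)
```

Passing the translators in as plain callables keeps the sketch independent of any particular translation service. Because only one paraphrase is generated per original sample, a single pass narrows the class gap without necessarily closing it; several pivot languages could be used to generate more variants.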
Subject
General Physics and Astronomy
Cited by
12 articles.
1. Predicting the locations of missing persons in China by using NGO data and deep learning techniques;International Journal of Digital Earth;2024-01-17
2. Text Classification Based on Multilingual Back-Translation and Model Ensemble;Communications in Computer and Information Science;2024
3. Easy Data Augmentation for Handling Imbalanced Data in Fake News Detection;2023 International Conference on Technology, Engineering, and Computing Applications (ICTECA);2023-12-20
4. Back Translation-EDA and Transformer for Hate Speech Classification in Indonesian;2023 International Conference on Informatics, Multimedia, Cyber and Informations System (ICIMCIS);2023-11-07
5. Test case classification via few-shot learning;Information and Software Technology;2023-08