Abstract
Text augmentation plays an important role in enhancing the generalization performance of language models. However, traditional methods often overlook both the distinct roles that individual words play in conveying meaning and imbalanced class distributions, which risks suboptimal performance and weakens the model's generalization ability. This limitation motivated us to develop a novel technique, Text Augmentation with Word Contributions (TAWC). Our approach tackles this problem in two core steps: first, it employs analytical correlation and semantic similarity metrics to discern the relationships between words and their associated aspect polarities; second, it tailors a distinct augmentation strategy to each word based on its identified functional contribution within the text. Extensive experiments on two aspect-based sentiment analysis datasets show that TAWC significantly improves the classification performance of popular language models, achieving gains of up to 4% and setting a new standard in the field of text augmentation.
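The two steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the scoring formula or the augmentation strategies, so everything here is an assumption made for illustration. The sketch scores each word by the Pearson correlation between its presence and a binary polarity label (the semantic similarity signal is left out), then replaces only low-contribution words with entries from a hypothetical synonym table, leaving high-contribution words untouched. The function names, the `threshold` parameter, the toy data, and the synonym table are all hypothetical.

```python
import math
import random


def word_label_correlation(texts, labels):
    """Pearson correlation between word presence (0/1) and a binary label (0/1).

    Assumed stand-in for a word-contribution score; the abstract does not
    give the actual formula.
    """
    n = len(texts)
    tokenized = [set(t.lower().split()) for t in texts]
    vocab = set().union(*tokenized)
    mean_y = sum(labels) / n
    var_y = sum((y - mean_y) ** 2 for y in labels)
    scores = {}
    for word in vocab:
        x = [1 if word in toks else 0 for toks in tokenized]
        mean_x = sum(x) / n
        var_x = sum((xi - mean_x) ** 2 for xi in x)
        if var_x == 0 or var_y == 0:
            # A word present in every text (or a constant label) carries no signal.
            scores[word] = 0.0
            continue
        cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, labels))
        scores[word] = cov / math.sqrt(var_x * var_y)
    return scores


def augment(text, scores, synonyms, threshold=0.5, rng=None):
    """Replace only low-contribution words; keep polarity-bearing words intact.

    `synonyms` is a hypothetical lookup table, and `threshold` is an
    assumed cut-off on the absolute contribution score.
    """
    rng = rng or random.Random(0)
    out = []
    for tok in text.split():
        key = tok.lower()
        if abs(scores.get(key, 0.0)) < threshold and key in synonyms:
            out.append(rng.choice(synonyms[key]))
        else:
            out.append(tok)
    return " ".join(out)


# Toy data (hypothetical): 1 = positive polarity, 0 = negative polarity.
texts = [
    "the food was great",
    "the food was awful",
    "the service was great",
    "the service was awful",
]
labels = [1, 0, 1, 0]
synonyms = {"food": ["meal"], "service": ["staff"]}

scores = word_label_correlation(texts, labels)
augmented = augment("the food was great", scores, synonyms)
print(augmented)
```

In this toy setup "great" and "awful" receive the largest absolute scores and are preserved, while "food", which is uncorrelated with the label, is the word that gets replaced. A fuller version would need the semantic similarity signal and some handling of class imbalance, both of which the abstract mentions without detail.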
Publisher
Research Square Platform LLC