Affiliation:
1. Research Center for Social Computing and Information Retrieval, Harbin Institute of Technology
Abstract
Recent advances, such as GPT and BERT, have shown that pre-training a transformer language model and then fine-tuning it can improve downstream NLP systems. However, this framework still struggles to effectively incorporate supervised knowledge from other related tasks. In this study, we investigate a transferable BERT (TransBERT) training framework, which can transfer to a target task not only general language knowledge from large-scale unlabeled data but also specific kinds of knowledge from various semantically related supervised tasks. Specifically, we propose using three kinds of transfer tasks, namely natural language inference, sentiment classification, and next action prediction, to further train BERT on top of the pre-trained model. This gives the model a better initialization for the target task. We take story ending prediction as the target task for our experiments. The final result, an accuracy of 91.8%, dramatically outperforms previous state-of-the-art baselines. Several comparative experiments offer practical guidance on how to select transfer tasks to improve BERT.
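The three-stage pipeline described in the abstract (unsupervised pre-training, supervised transfer-task training, then target-task fine-tuning) can be sketched schematically as follows. This is a minimal illustration of the staged-transfer idea only; the `Model` and `train_stage` names are hypothetical stand-ins, not from the paper's actual code, and no real gradient updates are performed.

```python
# Schematic sketch of the TransBERT three-stage training pipeline.
# `Model` and `train_stage` are illustrative placeholders: in the real
# framework each stage would run gradient updates on BERT's weights.

from dataclasses import dataclass, field


@dataclass
class Model:
    # Toy stand-in for model parameters: records which stages ran.
    history: list = field(default_factory=list)


def train_stage(model: Model, task: str) -> Model:
    """Continue training `model` on `task`, returning the updated model.

    Each stage initializes from the weights produced by the previous one,
    which is the core of the sequential-transfer setup.
    """
    model.history.append(task)
    return model


# Stage 1: general language knowledge from large-scale unlabeled data.
model = train_stage(Model(), "masked-lm-pretraining")
# Stage 2: supervised knowledge from a semantically related transfer task,
# e.g. natural language inference, sentiment classification,
# or next action prediction.
model = train_stage(model, "natural-language-inference")
# Stage 3: fine-tune on the target task (story ending prediction).
model = train_stage(model, "story-ending-prediction")

print(model.history)
```

The key design point is that each stage starts from the previous stage's weights rather than from scratch, so the target task inherits both general language knowledge and task-specific supervised knowledge.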
Publisher
International Joint Conferences on Artificial Intelligence Organization
Cited by
13 articles.
1. Automated Multi-Choice Question Answering System using Natural Language Processing;2024 3rd International Conference for Innovation in Technology (INOCON);2024-03-01
2. A sentiment analysis method for COVID-19 network comments integrated with semantic concept;Engineering Applications of Artificial Intelligence;2024-02
3. Two-stage fine-tuning for Low-resource English-based Creole with Pre-Trained LLMs;2023 IEEE Asia-Pacific Conference on Computer Science and Data Engineering (CSDE);2023-12-04
4. A-CAP: Anticipation Captioning with Commonsense Knowledge;2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR);2023-06
5. Incorporating BERT With Probability-Aware Gate for Spoken Language Understanding;IEEE/ACM Transactions on Audio, Speech, and Language Processing;2023