Leveraging Prompt and Top-K Predictions with ChatGPT Data Augmentation for Improved Relation Extraction-Reference-Cited by-同舟云学术

Leveraging Prompt and Top-K Predictions with ChatGPT Data Augmentation for Improved Relation Extraction

Published:2023-11-28 Issue:23 Volume:13 Page:12746
ISSN:2076-3417
Container-title:Applied Sciences
language:en
Short-container-title:Applied Sciences

Author:

Feng Ping¹²³⁴⁵^ORCID,Wu Hang⁶,Yang Ziqian⁶,Wang Yunyi⁶,Ouyang Dantong¹^ORCID

Affiliation:

1. College of Computer Science and Technology, Jilin University, Changchun 130012, China

2. College of Computer Science and Technology, Changchun University, Changchun 130022, China

3. Ministry of Education Key Laboratory of Intelligent Rehabilitation and Barrier-Free Access for the Disabled, Changchun 130022, China

4. Jilin Provincial Key Laboratory of Human Health State Identification and Function Enhancement, Changchun 130022, China

5. Jilin Rehabilitation Equipment and Technology Engineering Research Center for the Disabled, Changchun 130022, China

6. College of Cybersecurity, Changchun University, Changchun 130022, China

Abstract

Relation extraction tasks aim to predict the type of relationship between two entities from a given text. However, many existing methods fail to fully utilize the semantic information and the probability distribution of the output of pre-trained language models, and existing data augmentation approaches for natural language processing (NLP) may introduce errors. To address this issue, we propose a method that introduces prompt information and Top-K prediction sets and utilizes ChatGPT for data augmentation to improve relational classification model performance. First, we add prompt information before each sample and encode the modified samples by pre-training the language model RoBERTa and using these feature vectors to obtain the Top-K prediction set. We add a multi-attention mechanism to link the Top-K prediction set with the prompt information. We then reduce the possibility of introducing noise by bootstrapping ChatGPT so that it can better perform the data augmentation task and reduce subsequent unnecessary operations. Finally, we investigate the predefined relationship categories in the SemEval 2010 Task 8 dataset and the prediction results of the model and propose an entity location prediction task designed to assist the model in accurately determining the relative locations between entities. Experimental results indicate that our model achieves high results on the SemEval 2010 Task 8 dataset.

Funder

Science and Technology Development Plan Project of Jilin Provincial Science and Technology Department

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes,Computer Science Applications,Process Chemistry and Technology,General Engineering,Instrumentation,General Materials Science

Link

https://www.mdpi.com/2076-3417/13/23/12746/pdf

Reference21 articles.

1. Kenton, J.D.M.W.C., and Toutanova, L.K. (2019, January 2–7). Bert: Pre-training of deep bidirectional transformers for language understanding. Proceedings of the NAACL-HLT, Minneapolis, MN, USA.

2. Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., and Stoyanov, V. (2019). Roberta: A robustly optimized bert pretraining approach. arXiv.

3. Radford, A., Narasimhan, K., Salimans, T., and Sutskever, I. (2018). Improving Language Understanding by Generative Pre-Training, OpenAI.

4. Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing;Liu;ACM Comput. Surv.,2023

5. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., and Polosukhin, I. (2017, January 4–9). Attention is all you need. Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, Long Beach, CA, USA.

Cited by 1 articles. 订阅此论文施引文献订阅此论文施引文献，注册后可以免费订阅5篇论文的施引文献，订阅后可以查看论文全部施引文献

1. Unlocking the Potential: A Comprehensive Systematic Review of ChatGPT in Natural Language Processing Tasks;Computer Modeling in Engineering & Sciences;2024