Affiliation:
1. State Key Lab of Software Development Environment, Beihang University, Beijing 100191, China
2. School of Journalism, Communication University of China, Beijing 100024, China
Abstract
Supervised learning methods excel at traditional relation extraction, but their performance depends heavily on the quality and scale of the training data. Few-shot relation extraction, which aims to learn and extract semantic relationships between entities from only a limited number of annotated samples, is becoming a research hotspot. In recent years, numerous studies have employed prototypical networks for few-shot relation extraction. However, these methods often overfit to the relation classes, which makes it difficult to generalize effectively to new relations. This paper therefore uses a diffusion model for data augmentation to address the overfitting of prototypical networks. We propose a diffusion model-enhanced prototypical network framework. Specifically, we design and train a controllable conditional relation generation diffusion model on the relation extraction dataset; given a relation description, it generates the corresponding instance representation. Building on the trained diffusion model, we further present a pseudo-sample-enhanced prototypical network that provides more accurate representations of the class prototypes, thereby alleviating overfitting and generalizing better to unseen relation classes. Additionally, we introduce a pseudo-sample-aware attention mechanism that uses a cross-entropy loss to improve the model's adaptability to pseudo-sample data, further improving performance. A series of experiments demonstrates the effectiveness of our method. The results indicate that the proposed approach significantly outperforms existing methods, particularly in low-resource one-shot settings. Ablation analyses further underscore the necessity of each module in the model. To the best of our knowledge, this is the first study to use a diffusion model to enhance prototypical networks through data augmentation in few-shot relation extraction.
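As a rough illustration of the idea described in the abstract, the following minimal sketch shows how pseudo-samples could be folded into a class prototype before nearest-prototype classification. It is not the authors' implementation: the embeddings are toy 2-D vectors, the pseudo-samples are supplied by hand rather than generated by a diffusion model, and the attention form (a softmax over negative squared distance to the query) as well as the names `prototype` and `classify` are assumptions made only for this example.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def prototype(real, pseudo, query=None):
    """Weighted mean of real and pseudo embeddings for one relation class.

    Without a query this is the plain mean used by a standard prototypical
    network. With a query, each sample is weighted by a softmax over its
    negative squared distance to the query (an ASSUMED attention form, not
    the mechanism from the paper).
    """
    samples = real + pseudo
    if query is None:
        weights = [1.0 / len(samples)] * len(samples)
    else:
        weights = softmax([-sq_dist(s, query) for s in samples])
    dim = len(samples[0])
    return [sum(w * s[d] for w, s in zip(weights, samples)) for d in range(dim)]


def classify(query, support, pseudo):
    """Return the relation whose prototype is closest to the query."""
    protos = {
        rel: prototype(support[rel], pseudo.get(rel, []), query)
        for rel in support
    }
    return min(protos, key=lambda rel: sq_dist(query, protos[rel]))


# One-shot toy episode: one real support embedding per relation, plus one
# hand-written pseudo-sample standing in for diffusion-model output.
support = {"founder_of": [[1.0, 0.0]], "born_in": [[0.0, 1.0]]}
pseudo = {"founder_of": [[0.9, 0.1]], "born_in": [[0.1, 0.9]]}
prediction = classify([0.8, 0.2], support, pseudo)
```

In the one-shot setting each prototype would otherwise rest on a single support embedding, so the extra pseudo-samples are what give the mean something to average over; the query-dependent weights are one simple way to let less relevant samples contribute less.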
Funder
Fundamental Research Funds for the Central Universities