Abstract
Multimodal entity linking (MEL) is an emerging research field that is attracting growing attention. However, previous works typically focus on obtaining joint representations of mentions and entities and then determining the relationship between them from these representations. As a result, their models are often very complex and tend to ignore the relationships between information from different modalities across corpora. To address these problems, we propose a pre-training and fine-tuning paradigm for MEL. We design three categories of next sentence prediction (NSP) tasks for pre-training, i.e., mixed-modal, text-only, and multimodal, and we double the amount of pre-training data by swapping the roles of the two sentences in NSP. Our experimental results show that our model outperforms the baseline models and that each of our pre-training strategies contributes to the improvement. In addition, pre-training gives the final model strong generalization capability, so it performs well even with smaller amounts of data.
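The following is a minimal sketch (not the authors' code) of how NSP-style pre-training pairs could be constructed for the three task categories named in the abstract, with the data doubled by swapping the roles of the two segments in each pair. All names here (make_nsp_pairs, NSPPair, the image placeholder fields) are hypothetical illustrations under assumed inputs of a mention text, an entity description, and optional image features.

```python
# Hypothetical sketch of NSP-style pair construction for MEL pre-training.
# Three task categories (text-only, mixed-modal, multimodal) are illustrated,
# and the data is doubled by swapping the roles of the two segments.

from dataclasses import dataclass
from typing import List, Optional

@dataclass
class NSPPair:
    sentence_a: str                 # first segment fed to the encoder
    sentence_b: str                 # second segment
    image_a: Optional[str] = None   # placeholder id for an image feature, if any
    image_b: Optional[str] = None
    is_match: bool = True           # NSP-style label: do the segments belong together?
    task: str = "text-only"         # "text-only", "mixed-modal", or "multimodal"

def make_nsp_pairs(mention_text: str, entity_text: str,
                   mention_image: Optional[str],
                   entity_image: Optional[str]) -> List[NSPPair]:
    """Build one pair per task category, then double the data by role swapping."""
    base = [
        NSPPair(mention_text, entity_text, task="text-only"),
        NSPPair(mention_text, entity_text, image_a=mention_image, task="mixed-modal"),
        NSPPair(mention_text, entity_text, image_a=mention_image,
                image_b=entity_image, task="multimodal"),
    ]
    # Swapping sentence (and image) roles yields a second pair for every original one.
    swapped = [NSPPair(p.sentence_b, p.sentence_a, image_a=p.image_b,
                       image_b=p.image_a, is_match=p.is_match, task=p.task)
               for p in base]
    return base + swapped

if __name__ == "__main__":
    pairs = make_nsp_pairs("mention context ...", "entity description ...",
                           "img_mention_001", "img_entity_001")
    print(len(pairs))  # 6 pairs: 3 task categories x 2 (original + swapped)
```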
Funder
Science and Technology Project of State Grid Corporation of China
Subject
Electrical and Electronic Engineering, Computer Networks and Communications, Hardware and Architecture, Signal Processing, Control and Systems Engineering
Cited by
2 articles.