Multilingual Autoregressive Entity Linking

Author:

De Cao Nicola (1, 2), Wu Ledell (3), Popat Kashyap (4), Artetxe Mikel (5), Goyal Naman (6), Plekhanov Mikhail (7), Zettlemoyer Luke (8, 9), Cancedda Nicola (10), Riedel Sebastian (11, 12), Petroni Fabio (13)

Affiliation:

1. Facebook AI, UK.

2. University of Amsterdam, NL. nicola.decao@gmail.com

3. Beijing Academy of Artificial Intelligence, China. wuyu.ledell@gmail.com; ledell@fb.com

4. Facebook AI, UK. kpopat@fb.com

5. Facebook AI, UK. artetxe@fb.com

6. Facebook AI, USA. naman@fb.com

7. Facebook AI, UK. movb@fb.com

8. Facebook AI, USA

9. University of Washington, USA. lsz@fb.com

10. Facebook AI, UK. ncan@fb.com

11. Facebook AI, UK

12. University College London, UK. sriedel@fb.com

13. Facebook AI, UK. fabiopetroni@fb.com

Abstract

We present mGENRE, a sequence-to-sequence system for the Multilingual Entity Linking (MEL) problem, the task of resolving language-specific mentions to a multilingual Knowledge Base (KB). For a mention in a given language, mGENRE predicts the name of the target entity left-to-right, token-by-token in an autoregressive fashion. The autoregressive formulation allows us to effectively cross-encode mention string and entity names to capture more interactions than the standard dot product between mention and entity vectors. It also enables fast search within a large KB even for mentions that do not appear in mention tables and with no need for large-scale vector indices. While prior MEL works use a single representation for each entity, we match against entity names of as many languages as possible, which allows exploiting language connections between source input and target name. Moreover, in a zero-shot setting on languages with no training data at all, mGENRE treats the target language as a latent variable that is marginalized at prediction time. This leads to over 50% improvements in average accuracy. We show the efficacy of our approach through extensive evaluation including experiments on three popular MEL benchmarks where we establish new state-of-the-art results. Source code available at https://github.com/facebookresearch/GENRE.
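
The idea of treating the target language as a latent variable that is marginalized at prediction time can be illustrated with a minimal sketch. It assumes a model has already produced a log-probability for each (entity, language) name pair given a mention; the function name, the entity IDs `E1` and `E2`, and the scores are all hypothetical and are not taken from the mGENRE code base.

```python
import math
from collections import defaultdict


def marginalize_over_languages(log_scores):
    """Sum the probability of each entity over all of its language-specific names.

    log_scores maps (entity_id, language) -> log-probability of generating that
    entity's name in that language (hypothetical input format).
    Returns a dict entity_id -> log of the summed probability.
    """
    grouped = defaultdict(list)
    for (entity_id, _language), logp in log_scores.items():
        grouped[entity_id].append(logp)

    marginal = {}
    for entity_id, logps in grouped.items():
        # log-sum-exp for numerical stability
        top = max(logps)
        marginal[entity_id] = top + math.log(sum(math.exp(lp - top) for lp in logps))
    return marginal


# Toy, made-up scores: E2 has the single best name, but E1 wins once
# its probability mass is pooled across languages.
toy_scores = {
    ("E1", "en"): math.log(0.30),
    ("E1", "de"): math.log(0.25),
    ("E2", "en"): math.log(0.40),
}
marginal = marginalize_over_languages(toy_scores)
best_entity = max(marginal, key=marginal.get)
```

In this toy case the highest-scoring single name belongs to `E2`, while the marginal score favors `E1`, which shows why pooling over languages can change the prediction.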

Publisher

MIT Press - Journals

Subject

Artificial Intelligence, Computer Science Applications, Linguistics and Language, Human-Computer Interaction, Communication

Cited by 13 articles.

1. An entity-centric approach to manage court judgments based on Natural Language Processing;Computer Law & Security Review;2024-04

2. Adaptive deep learning for entity disambiguation via knowledge-based risk analysis;Expert Systems with Applications;2024-03

3. A method for constructing a machining knowledge graph using an improved transformer;Expert Systems with Applications;2024-03

4. BioPRO: Context-Infused Prompt Learning for Biomedical Entity Linking;IEEE/ACM Transactions on Audio, Speech, and Language Processing;2024

5. Effective Entity Disambiguation in Low-Resource Languages: A Study of Icelandic;2023 IEEE International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT);2023-10-26
