1. Antoine Bordes, Nicolas Usunier, Alberto García-Durán, Jason Weston, and Oksana Yakhnenko. 2013. Translating embeddings for modeling multi-relational data. In Annual Conference on Neural Information Processing Systems (NIPS’13). 2787–2795.
2. Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. 2020. Language models are few-shot learners. In Annual Conference on Neural Information Processing Systems (NeurIPS’20).
3. Zewen Chi, Li Dong, Furu Wei, Nan Yang, Saksham Singhal, Wenhui Wang, Xia Song, Xian-Ling Mao, Heyan Huang, and Ming Zhou. 2021. InfoXLM: An information-theoretic framework for cross-lingual language model pre-training. In Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT’21). 3576–3588.
4. Eunsol Choi, Omer Levy, Yejin Choi, and Luke Zettlemoyer. 2018. Ultra-fine entity typing. In 56th Annual Meeting of the Association for Computational Linguistics (ACL’18). 87–96.
5. Andrew M. Dai and Quoc V. Le. 2015. Semi-supervised sequence learning. In Annual Conference on Neural Information Processing Systems (NIPS’15). 3079–3087.