A Primer on Contrastive Pretraining in Language Processing: Methods, Lessons Learned, and Perspectives

Published: 2023-02-02 Issue: 10 Volume: 55 Page: 1-17
ISSN: 0360-0300
Container-title: ACM Computing Surveys
Language: en
Short-container-title: ACM Comput. Surv.

Author:

Rethmeier Nils¹, Augenstein Isabelle²

Affiliation:

1. German Research Center for AI, Berlin, Germany; University of Copenhagen, Denmark

2. University of Copenhagen, Copenhagen, Denmark

Abstract

Modern natural language processing (NLP) methods employ self-supervised pretraining objectives such as masked language modeling to boost performance on various downstream tasks. These pretraining methods are frequently extended with recurrence, adversarial, or linguistic property masking. Recently, contrastive self-supervised training objectives have enabled successes in image representation pretraining by learning to contrast input-input pairs of augmented images as either similar or dissimilar. In NLP, however, a single token augmentation can invert the meaning of a sentence during input-input contrastive learning, which led to input-output contrastive approaches that avoid the issue by instead contrasting over input-label pairs. In this primer, we summarize recent self-supervised and supervised contrastive NLP pretraining methods and describe where they are used to improve language modeling, zero- to few-shot learning, pretraining data-efficiency, and specific NLP tasks. We give an overview of key contrastive learning concepts with lessons learned from prior research and organize prior work by application. Finally, we point to open challenges and future directions for contrastive NLP to encourage bringing contrastive NLP pretraining closer to recent successes in image representation pretraining.

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science, Theoretical Computer Science

Link

https://dl.acm.org/doi/pdf/10.1145/3561970

References: 66 articles.

1. Words Aren’t Enough, Their Order Matters: On the Robustness of Grounding Visual Referring Expressions

2. On Losses for Modern Language Models

3. A Unifying Mutual Information View of Metric Learning: Cross-Entropy vs. Pairwise Losses

4. Tiffany Tianhui Cai, Jonathan Frankle, David J. Schwab, and Ari S. Morcos. 2020. Are All Negatives Created Equal in Contrastive Instance Discrimination? Retrieved from arXiv:2010.06682.

5. MixText: Linguistically-Informed Interpolation of Hidden Space for Semi-Supervised Text Classification

Cited by 27 articles.

1. Attribute mining multi-view contrastive learning network for recommendation;Expert Systems with Applications;2024-11

2. TCLNet: Turn-level contrastive learning network with reranking for dialogue state tracking;Knowledge-Based Systems;2024-10

3. Multivariate graph neural networks on enhancing syntactic and semantic for aspect-based sentiment analysis;Applied Intelligence;2024-08-30

4. Beyond "Taming Electric Scooters": Disentangling Understandings of Micromobility Naturalistic Riding;Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies;2024-08-22

5. Text semantic matching algorithm based on the introduction of external knowledge under contrastive learning;International Journal of Machine Learning and Cybernetics;2024-07-24