Abstract
With the recent advances in deep learning, different approaches to improving pre-trained language models (PLMs) have been proposed. PLMs have advanced state-of-the-art (SOTA) performance on various natural language processing (NLP) tasks such as machine translation, text classification, question answering, text summarization, information retrieval, recommendation systems, and named entity recognition. In this paper, we provide a comprehensive review of earlier embedding models as well as recent breakthroughs in the field of PLMs. We then compare and contrast the various models and analyse how they were built (number of parameters, compression techniques, etc.). Finally, we discuss the major open issues and future research directions for each of the main topics covered.
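As an illustrative sketch only (not part of the paper), the parameter counts used in such model comparisons can be read directly from publicly released checkpoints. The snippet below assumes the Hugging Face transformers library and uses the bert-base-uncased checkpoint purely as an example.

    from transformers import AutoModel

    # Load a publicly released pre-trained language model (example checkpoint).
    model = AutoModel.from_pretrained("bert-base-uncased")

    # Count trainable parameters, one attribute surveys commonly use to compare PLMs.
    num_params = sum(p.numel() for p in model.parameters())
    print(f"bert-base-uncased parameters: {num_params:,}")  # roughly 110 million

The same loop applies to any checkpoint name, so larger or compressed variants can be compared on the same basis.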
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science
Cited by 39 articles.