Comparative Analysis of Transformer-Based Language Models

Author:

Aman Pathak

Abstract

Natural language processing (NLP) has witnessed substantial advances in the past three years. With the introduction of the Transformer architecture and its self-attention mechanism, language models can now learn richer representations of natural language. These attention-based models have achieved state-of-the-art results on a variety of NLP benchmarks. One contributing factor is the growing use of transfer learning: models are pre-trained on unsupervised objectives over large corpora to develop fundamental language abilities, and are then fine-tuned on supervised data for downstream tasks. Remarkably, recent research has produced a new class of powerful models that no longer require fine-tuning. The objective of this paper is to present a comparative analysis of some of the most influential language models. The models are compared along five dimensions: problem-solving methodology, model architecture, compute requirements, accuracy on standard NLP benchmarks, and shortcomings.
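To make the self-attention mechanism named in the abstract concrete, the following is a minimal NumPy sketch of scaled dot-product attention as defined by Vaswani et al. (2017). It is an illustrative sketch, not code from the paper; the shapes, random projection weights, and function names are assumptions chosen for the example.

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # softmax(Q K^T / sqrt(d_k)) V, the core Transformer operation
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # pairwise token similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # numerically stable row-wise softmax
    return weights @ V                               # attention-weighted mixture of value vectors

# Toy example: 4 tokens, model dimension 8 (sizes are illustrative only)
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                          # token embeddings
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = scaled_dot_product_attention(x @ Wq, x @ Wk, x @ Wv)
print(out.shape)                                     # (4, 8): one contextualized vector per token

Each output row is a weighted average of all value vectors, with weights determined by how strongly that token's query matches every key; this is the mechanism that lets the models surveyed in the paper build context-dependent representations.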

Publisher

AIRCC Publishing Corporation

Cited by 4 articles.
