1. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I. Attention is all you need. Adv Neural Inf Process Syst. 2017;30.
2. Vajjala S, Majumder B, Gupta A, Surana H. Practical natural language processing: a comprehensive guide to building real-world NLP systems. O'Reilly Media; 2020.
3. Devlin J, Chang MW, Lee K, Toutanova K. BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT); 2019.
4. Radford A, Wu J, Child R, Luan D, Amodei D, Sutskever I. Language models are unsupervised multitask learners. OpenAI Blog. 2019;1(8):9.
5. Yang Z, Dai Z, Yang Y, Carbonell J, Salakhutdinov RR, Le QV. XLNet: generalized autoregressive pretraining for language understanding. Adv Neural Inf Process Syst. 2019;32.