Evaluation of Language Models on Romanian XQuAD and RoITD datasets
-
Published:2023-02-09
Issue:1
Volume:18
Page:
-
ISSN:1841-9844
-
Container-title:INTERNATIONAL JOURNAL OF COMPUTERS COMMUNICATIONS & CONTROL
-
language:
-
Short-container-title:INT J COMPUT COMMUN, Int. J. Comput. Commun. Control
Author:
Nicolae Constantin Dragos, Kumar Yadav Rohan, Tufiş Dan
Abstract
Natural language processing (NLP) has become a vital component of a wide range of applications, including machine translation, information retrieval, and text classification. The development and evaluation of NLP models for various languages have received significant attention in recent years, but relatively little work has compared the performance of different language models on Romanian data. In particular, Romanian-specific language models have rarely been compared directly with multilingual models. In this paper, we address this gap by evaluating eight NLP models on two Romanian datasets, XQuAD and RoITD. Our experiments show that bert-base-multilingual-cased and bert-base-multilingual-uncased perform best on both the XQuAD and RoITD tasks, while the RoBERT-small and DistilBERT models perform worst. We also discuss the implications of our findings and outline directions for future work in this area.
Publisher
Agora University of Oradea
Subject
Computational Theory and Mathematics,Computer Networks and Communications,Computer Science Applications
Cited by
1 article.