Transformer-Based Composite Language Models for Text Evaluation and Classification

Published: 2023-11-16 Issue: 22 Volume: 11 Page: 4660
ISSN: 2227-7390
Container-title: Mathematics
Language: en
Short-container-title: Mathematics

Author:

Škorić Mihailo¹, Utvić Miloš², Stanković Ranka¹

Affiliation:

1. Faculty of Mining and Geology, University of Belgrade, Djusina 7, 11120 Belgrade, Serbia

2. Faculty of Philology, University of Belgrade, Studentski Trg 3, 11000 Belgrade, Serbia

Abstract

Parallel natural language processing systems were previously tested successfully on part-of-speech tagging and authorship attribution through mini-language modeling, where they achieved significantly better results than independent methods across seven European languages. This paper presents the advantages of using composite language models for processing and evaluating texts written in an arbitrary highly inflective, morphologically rich natural language, particularly Serbian. A perplexity-based dataset, the main asset for assessing the methodology, was created using a series of generative pre-trained transformers trained on different representations of a Serbian language corpus, together with a set of sentences classified into three groups (expert translations, corrupted translations, and machine translations). The paper presents a comparative analysis of the calculated perplexities to measure the classification capability of different models on two binary classification tasks. In the experiment, we tested three standalone language models (the baseline) and two composite language models (both based on the perplexities output by all three standalone models). The results single out a complex stacked classifier, which uses a multitude of features extracted from perplexity vectors, as the optimal composite language model architecture for both tasks.
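As an illustration of the perplexity vectors mentioned in the abstract, the following is a minimal sketch that assumes the standard definition of perplexity as the exponentiated mean negative log-likelihood of a sentence's tokens. The model names (`model_a`, `model_b`, `model_c`) and the token probabilities are hypothetical placeholders, not values from the paper, and the sketch does not reproduce the paper's feature extraction or its stacked classifier.

```python
import math
from typing import Dict, List, Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Perplexity of one sentence from per-token natural-log probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    # exp of the mean negative log-likelihood over the sentence's tokens
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def perplexity_vector(log_probs_by_model: Dict[str, Sequence[float]]) -> List[float]:
    """One perplexity per standalone model, ordered by sorted model name."""
    return [perplexity(log_probs_by_model[name]) for name in sorted(log_probs_by_model)]


# Hypothetical per-token log-probabilities that three standalone models
# assign to the same two-token sentence (illustrative values only).
sentence_scores = {
    "model_a": [math.log(0.5), math.log(0.5)],
    "model_b": [math.log(0.25), math.log(0.25)],
    "model_c": [math.log(0.1), math.log(0.1)],
}

# Assumption: a composite model would consume a vector like this as input features.
features = perplexity_vector(sentence_scores)
```

A vector of this kind, computed per sentence, is the sort of input a composite classifier could be trained on; which features are derived from it and how they are combined is left to the paper itself.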

Funder

Program PRIZMA, the Science Fund of the Republic of Serbia

Publisher

MDPI AG

Subject

General Mathematics, Engineering (miscellaneous), Computer Science (miscellaneous)

Link

https://www.mdpi.com/2227-7390/11/22/4660/pdf


Cited by 1 article.

1. Automated Quality Concerns Extraction from User Stories and Acceptance Criteria for Early Architectural Decisions; Lecture Notes in Computer Science; 2024