Abstract
The purpose of this study is to describe and evaluate a multilingual automated essay scoring (AES) system for grading essays in three languages. Two sentence embedding models were evaluated within the AES system: multilingual BERT (mBERT) and language-agnostic BERT sentence embedding (LaBSE). German, Italian, and Czech essays were holistically scored using the Common European Framework of Reference for Languages. The AES system with mBERT produced results that were consistent with human raters overall across all three language groups, and it produced accurate predictions for some, but not all, of the score levels within each language. The AES system with LaBSE produced results that were even more consistent with the human raters overall across all three language groups, and it produced accurate predictions for the majority of the score levels within each language. The performance differences between mBERT and LaBSE can be explained by considering how each sentence embedding model is implemented. Implications of this study for educational testing are also discussed.
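To make the implementation difference concrete, the sketch below shows one common way to obtain essay-level vectors from each model. This is an illustration under stated assumptions, not the study's implementation: the checkpoint names are the public Hugging Face releases, and mean-pooling mBERT's token embeddings is a widely used convention rather than a detail taken from the paper.

```python
import torch
from transformers import AutoTokenizer, AutoModel
from sentence_transformers import SentenceTransformer

# Hypothetical sample sentences in the study's three languages.
essays = [
    "Der Aufsatz ist klar strukturiert.",  # German
    "Il saggio è ben strutturato.",        # Italian
    "Esej je dobře strukturovaná.",        # Czech
]

# mBERT was pretrained with token-level objectives only, so a fixed-size
# text vector is commonly derived by mean-pooling its token embeddings.
tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
mbert = AutoModel.from_pretrained("bert-base-multilingual-cased")
batch = tok(essays, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    hidden = mbert(**batch).last_hidden_state      # (3, seq_len, 768)
mask = batch["attention_mask"].unsqueeze(-1).float()
mbert_vecs = (hidden * mask).sum(dim=1) / mask.sum(dim=1)  # (3, 768)

# LaBSE was trained with a translation-ranking objective, so it emits
# cross-lingually aligned sentence embeddings directly.
labse = SentenceTransformer("sentence-transformers/LaBSE")
labse_vecs = labse.encode(essays)                  # (3, 768)
```

In an AES pipeline, vectors like these would feed a downstream score-prediction model. The contrast above suggests one plausible reading of the results: LaBSE's training objective aligns sentences across languages in a shared space, whereas mBERT's pooled token embeddings are not optimized for cross-lingual sentence similarity, which may account for LaBSE's closer agreement with human raters.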