Item Difficulty Prediction Using Item Text Features: Comparison of Predictive Performance across Machine-Learning Algorithms

Published:2023-09-28 Issue:19 Volume:11 Page:4104
ISSN:2227-7390
Container-title:Mathematics
language:en
Short-container-title:Mathematics

Author:

Štěpánek Lubomír¹², Dlouhá Jana¹³, Martinková Patrícia¹⁴

Affiliation:

1. Institute of Computer Science of the Czech Academy of Sciences, 182 07 Prague, Czech Republic

2. First Faculty of Medicine, Charles University, 121 08 Prague, Czech Republic

3. Faculty of Arts, Charles University, 116 38 Prague, Czech Republic

4. Faculty of Education, Charles University, 110 00 Prague, Czech Republic

Abstract

This work presents a comparative analysis of various machine learning (ML) methods for predicting item difficulty in English reading comprehension tests using text features extracted from item wordings. A wide range of ML algorithms are employed within both the supervised regression and the classification tasks, including regularization methods, support vector machines, trees, random forests, back-propagation neural networks, and Naïve Bayes; moreover, the ML algorithms are compared to the performance of domain experts. Using f-fold cross-validation and considering the root mean square error (RMSE) as the performance metric, elastic net outperformed other approaches in a continuous item difficulty prediction. Within classifiers, random forests returned the highest extended predictive accuracy. We demonstrate that the ML algorithms implementing item text features can compete with predictions made by domain experts, and we suggest that they should be used to inform and improve these predictions, especially when item pre-testing is limited or unavailable. Future research is needed to study the performance of the ML algorithms using item text features on different item types and respondent populations.
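The evaluation scheme named in the abstract (fold-wise cross-validation scored by RMSE) can be illustrated with a minimal sketch. It is not the authors' code: the data are synthetic, the single "text feature" is hypothetical, and a one-feature least-squares line stands in for the algorithms the paper compares (elastic net, random forests, and others). All function names below are introduced only for illustration.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def fold_indices(n, n_folds, seed=0):
    """Shuffle indices 0..n-1 and split them into n_folds near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x (a stand-in model, not elastic net)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx if sxx else 0.0
    return my - b * mx, b


def cross_validated_rmse(x, y, n_folds=5, seed=0):
    """Average the RMSE obtained on each held-out fold."""
    scores = []
    for test_idx in fold_indices(len(x), n_folds, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(x)) if i not in held_out]
        a, b = fit_line([x[i] for i in train_idx], [y[i] for i in train_idx])
        preds = [a + b * x[i] for i in test_idx]
        scores.append(rmse([y[i] for i in test_idx], preds))
    return sum(scores) / len(scores)


# Synthetic data (assumption): one hypothetical text feature and a noisy difficulty score.
rng = random.Random(42)
feature = [rng.uniform(5, 40) for _ in range(60)]
difficulty = [0.05 * f - 1.0 + rng.gauss(0, 0.2) for f in feature]

cv_rmse = cross_validated_rmse(feature, difficulty, n_folds=5)
print(f"cross-validated RMSE: {cv_rmse:.3f}")
```

Comparing several algorithms under this scheme would amount to swapping `fit_line` for each candidate model while keeping the same folds, so that the resulting RMSE values are directly comparable.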

Funder

Czech Science Foundation

RVO

Charles University

Publisher

MDPI AG

Subject

General Mathematics,Engineering (miscellaneous),Computer Science (miscellaneous)

Link

https://www.mdpi.com/2227-7390/11/19/4104/pdf

References: 88 articles.

1. Martinková, P., and Hladká, A. (2023). Computational Aspects of Psychometric Methods: With R, CRC Press.