Affiliations:
1. Winston Churchill High School, Potomac, MD, USA
2. University of Maryland, College Park, USA
Abstract
Cheating detection in large-scale assessment has received considerable attention in the literature. However, no previous study in this line of research has investigated the stacking ensemble machine learning algorithm for cheating detection, and none has addressed class imbalance through resampling. This study explored the application of the stacking ensemble machine learning algorithm to the item response, response time, and augmented data of test-takers to detect cheating behaviors. The performance of the stacking method was compared with that of two other ensemble methods (bagging and boosting) as well as six base non-ensemble machine learning algorithms. Issues related to class imbalance and input features were addressed. The results indicated that stacking, resampling, and feature sets including augmented summary data generally performed better than their counterparts in cheating detection. Across all study conditions, the best overall performance came from the stacking meta-model that applied discriminant analysis to the top two base models, Gradient Boosting and Random Forest, with item responses and the augmented summary statistics as input features and an under-sampling ratio of 10:1.
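To make the resampling step concrete, the following is a minimal sketch of random under-sampling to a 10:1 ratio. It is an illustration only, not the study's implementation. The abstract does not state which under-sampling method was used or which direction the ratio runs, so the sketch assumes random removal of majority-class (non-cheating) cases and a ratio of 10 majority cases per minority case. The function name `undersample`, the label coding (1 = flagged as cheating), and the toy data are all hypothetical.

```python
import random
from collections import Counter


def undersample(rows, labels, ratio=10, minority_label=1, seed=0):
    """Randomly drop majority-class rows so that majority:minority is at most ratio:1.

    Assumption: random under-sampling; the study's actual procedure may differ.
    """
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == minority_label]
    majority_idx = [i for i, y in enumerate(labels) if y != minority_label]
    # Keep every minority case and at most `ratio` majority cases per minority case.
    keep_n = min(len(majority_idx), ratio * len(minority_idx))
    kept = sorted(minority_idx + rng.sample(majority_idx, keep_n))
    return [rows[i] for i in kept], [labels[i] for i in kept]


# Hypothetical toy data: 1,000 test-takers, 20 of them flagged (label 1).
labels = [1] * 20 + [0] * 980
rows = [[i] for i in range(1000)]  # placeholder feature vectors

X_res, y_res = undersample(rows, labels)
counts = Counter(y_res)
print(counts[0], counts[1])  # prints "200 20"
```

In this sketch the resampled data would then be passed to the base learners and the stacking meta-model. Resampling is normally applied to the training split only, so that evaluation reflects the original class distribution.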
Subject
Applied Mathematics, Applied Psychology, Developmental and Educational Psychology, Education