Affiliation:
1. Centre for Research in Applied Measurement and Evaluation, University of Alberta, Canada
2. Measurement, Evaluation, and Data Science, University of Alberta, Canada
Abstract
In this study, we present three unsupervised anomaly detection methods for identifying anomalous test-takers based on their action sequences in problem-solving tasks. The first method uses the Isolation Forest algorithm to detect anomalous test-takers from raw action sequences extracted from process data. The second method transforms raw action sequences into contextual embeddings using the Bidirectional Encoder Representations from Transformers (BERT) model and then applies the Isolation Forest algorithm to those embeddings. The third method follows the same procedure as the second but adds an intermediate dimensionality-reduction step for the contextual embeddings before applying the Isolation Forest algorithm. To compare the outcomes of the three methods, we analyzed the log files of test-takers in the US sample (n = 2,021) who completed the problem-solving in technology-rich environments (PSTRE) section of the Programme for the International Assessment of Adult Competencies (PIAAC) 2012 assessment. The results indicated that different groups of test-takers were flagged as anomalous depending on the representation (raw action sequences vs. contextual embeddings) and the dimensionality of the action sequences. In addition, when the contextual embeddings were used, the Isolation Forest algorithm flagged a larger number of test-takers, indicating that the algorithm is sensitive to the dimensionality of its input data.
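The core idea behind the first method can be illustrated with a small, self-contained sketch. The abstract does not say how raw action sequences were encoded or which hyperparameters were used, so everything below is an assumption made for illustration: sequences are encoded as action counts, the data are synthetic, and the pure-Python Isolation Forest (with the hypothetical helpers `count_features`, `build_tree`, and `anomaly_scores`) is a teaching version, not the study's implementation.

```python
import math
import random
from collections import Counter

def count_features(sequences, vocab):
    """Encode each action sequence as a vector of action counts (assumed encoding)."""
    counts = [Counter(seq) for seq in sequences]
    return [[c[a] for a in vocab] for c in counts]

def c_factor(n):
    """Expected path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n

def build_tree(rows, depth, max_depth, rng):
    """Recursively partition rows with random axis-aligned splits."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    # Only features that still vary within this node can be split.
    dims = [d for d in range(len(rows[0]))
            if min(r[d] for r in rows) != max(r[d] for r in rows)]
    if not dims:
        return {"size": len(rows)}
    dim = rng.choice(dims)
    lo = min(r[dim] for r in rows)
    hi = max(r[dim] for r in rows)
    split = rng.uniform(lo, hi)
    left = [r for r in rows if r[dim] < split]
    right = [r for r in rows if r[dim] >= split]
    return {"dim": dim, "split": split,
            "left": build_tree(left, depth + 1, max_depth, rng),
            "right": build_tree(right, depth + 1, max_depth, rng)}

def path_length(row, node, depth=0):
    """Depth at which a row lands in a leaf, adjusted for the leaf's size."""
    if "dim" not in node:
        return depth + c_factor(node["size"])
    child = node["left"] if row[node["dim"]] < node["split"] else node["right"]
    return path_length(row, child, depth + 1)

def anomaly_scores(rows, n_trees=100, sample_size=256, seed=0):
    """Return one score per row; scores closer to 1 mean easier to isolate."""
    if len(rows) < 2:
        raise ValueError("need at least two rows")
    rng = random.Random(seed)
    m = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(m))
    trees = [build_tree(rng.sample(rows, m), 0, max_depth, rng)
             for _ in range(n_trees)]
    norm = c_factor(m)
    return [2.0 ** (-sum(path_length(r, t) for t in trees) / (n_trees * norm))
            for r in rows]

# Synthetic example: 50 similar test-takers plus one with an unusual sequence.
vocab = ["start", "click", "next", "help", "submit"]
sequences = [["start"] + ["click"] * (1 + i % 3) + ["next", "submit"]
             for i in range(50)]
sequences.append(["start"] + ["click"] * 30 + ["help"] * 10 + ["next", "submit"])

scores = anomaly_scores(count_features(sequences, vocab))
most_anomalous = scores.index(max(scores))
print("most anomalous test-taker index:", most_anomalous)
```

Under the same assumptions, the second and third methods would pass different rows to `anomaly_scores`: one contextual embedding vector per test-taker, either at full dimensionality or after a dimensionality-reduction step. Because each split picks a feature at random, the number and scale of input features directly shape the path lengths, which is one way to see why the flagged group can change with the representation.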
Cited by
1 article.