To Score or Not to Score: Factors Influencing Performance and Feasibility of Automatic Content Scoring of Text Responses

Published: 2023-02-14 Issue: 1 Volume: 42 Page: 44-58
ISSN: 0731-1745
Container-title: Educational Measurement: Issues and Practice
Language: en
Short-container-title: Educational Measurement

Author:

Zesch Torsten¹, Horbach Andrea¹, Zehner Fabian²³

Affiliation:

1. CATALPA FernUniversität in Hagen

2. DIPF | Leibniz Institute for Research and Information in Education

3. Centre for International Student Assessment (ZIB)

Abstract

In this article, we systematize the factors influencing performance and feasibility of automatic content scoring methods for short text responses. We argue that performance (i.e., how well an automatic system agrees with human judgments) mainly depends on the linguistic variance seen in the responses and that this variance is indirectly influenced by other factors such as target population or input modality. Extending previous work, we distinguish conceptual, realization, and nonconformity variance, which are differentially impacted by the various factors. While conceptual variance relates to different concepts embedded in the text responses, realization variance refers to their diverse manifestation through natural language. Nonconformity variance is added by aberrant response behavior. Furthermore, besides its performance, the feasibility of using an automatic scoring system depends on external factors, such as ethical or computational constraints, which influence whether a system with a given performance is accepted by stakeholders. Our work provides (i) a framework for assessment practitioners to decide a priori whether automatic content scoring can be successfully applied in a given setup as well as (ii) new empirical findings and the integration of empirical findings from the literature on factors that influence automatic systems' performance.

Publisher

Wiley

Subject

Education

Link

https://onlinelibrary.wiley.com/doi/pdf/10.1111/emip.12544

References: 82 articles.

1. shinyReCoR: A Shiny Application for Automatically Coding Text Responses Using R

2. Andersen, N., Zehner, F., & Goldhammer, F. (in print). Semi-automatic coding of open-ended text responses in large-scale assessments. Journal of Computer Assisted Learning. https://doi.org/10.1111/jcal.12717

3. APA (2017). Ethical principles of psychologists and code of conduct (2002, amended effective June 1, 2010, and January 1, 2017). Retrieved from https://www.apa.org/ethics/code [2020-03-23].

4. Algorithmic Bias in Education

Cited by 2 articles.

1. From the Automated Assessment of Student Essay Content to Highly Informative Feedback: a Case Study;International Journal of Artificial Intelligence in Education;2024-01-25

2. A Method of Computer Automatic Scoring for Subjective Questions;2023 2nd International Conference on Artificial Intelligence, Human-Computer Interaction and Robotics (AIHCIR);2023-12-08