Affiliation:
1. Institute of Banking Personnel Selection, Mumbai, Maharashtra, India
Abstract
Constructed-response measures are an established means of assessing drafting ability and have been used extensively in managerial selection in state-owned organizations. With the advent of online recruitment and technology-enhanced assessment, automated scoring has been conceived as a replacement for human scoring, with the purpose of emulating the human scoring system. In the context of large-scale writing assessments, automated scoring could provide superior results to human scoring in terms of validity and reliability. In the present study, an attempt has been made to validate an automated essay scoring (AES) algorithm. The study was conducted on a sample of 11,497 candidates randomly selected from a population of 54,392 shortlisted for a national-level examination for the selection of entry-level executives in managerial positions in state-owned banks and a state-owned insurance company. The evaluation of the descriptive (constructed-response) component of the examination (English composition) was carried out in parallel by four expert human raters and the AES algorithm, with the evaluation parameters devised and made available to both the raters and the algorithm. Data were analysed using means, standard deviations, and Pearson correlation coefficients. Results show that the mean scores of the human expert raters (M = 14.648) and the automated algorithm (M = 15.804) were similar. Further analysis was undertaken to check the convergent validity of the features used in the algorithm by examining the relation of algorithm scores to sub-scores from an objective test of the same construct; results indicate a significant correlation. It can thus be said that the algorithmic scoring method developed can serve alongside, or in place of, human expert raters in the evaluation of descriptive papers, offering consistent and fair scoring without the inherent biases of inter-rater and intra-rater variation, in addition to the practical benefits of speed and cost.
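The analyses reported in the abstract (descriptive statistics for each scoring method, human-AES agreement, and convergent validity against objective-test sub-scores) can be illustrated with a minimal sketch. This is not the authors' code; the file name and column names (scores.csv, human_score, aes_score, objective_subscore) are hypothetical placeholders for the study's actual variables.

```python
# Minimal sketch (hypothetical data layout): compare human-rater and
# AES scores, then check convergent validity against an objective sub-score.
import pandas as pd
from scipy.stats import pearsonr

# One row per candidate; column names are assumed for illustration only.
scores = pd.read_csv("scores.csv")

# Descriptive statistics (mean, SD) for each scoring method
print(scores[["human_score", "aes_score"]].agg(["mean", "std"]))

# Agreement between human expert raters and the AES algorithm
r, p = pearsonr(scores["human_score"], scores["aes_score"])
print(f"Human vs. AES: r = {r:.3f}, p = {p:.4f}")

# Convergent validity: AES scores vs. objective-test sub-scores
# measuring the same construct (English language)
r2, p2 = pearsonr(scores["aes_score"], scores["objective_subscore"])
print(f"AES vs. objective sub-score: r = {r2:.3f}, p = {p2:.4f}")
```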
Subject
General Earth and Planetary Sciences, General Environmental Science