Affiliation:
1. Ghana Communication Technology University, Ghana
2. University of Mines and Technology, Tarkwa, Ghana
Abstract
The study sought to statistically compare sets of essay scores generated by an automated essay scoring (AES) system (ChatGPT) with those assigned by a human grader, and to further engage stakeholders (students, lecturers, and university management) in a discussion of the results from the perspectives of fairness, bias, consistency with human grading, ethical issues, and adoption. The study adopted a sequential explanatory mixed-methods design: the quantitative phase involved the collection and analysis of essay scores, while the qualitative phase used interviews to ascertain stakeholder opinions of the quantitative results. The quantitative results showed that the distribution of ChatGPT scores was the same across categories of age, gender, and ethnicity. Further, there was no statistically significant difference between the ChatGPT scores and those of the human grader. The analysis of the interview responses is thoroughly discussed.