Author:
Bijani Houman, Hashempour Bahareh, Ibrahim Khaled Ahmed Abdel-Al, Orabah Salim Said Bani, Heydarnejad Tahereh
Abstract
Because oral assessment is subjective, much attention has been devoted to achieving a satisfactory level of consistency among raters. However, procedures that increase consistency do not necessarily lead to valid decisions. One issue at the core of both reliability and validity in oral assessment is rater training. Recently, multifaceted Rasch measurement (MFRM) has been adopted to address rater bias and inconsistency in scoring; however, no study has examined the facets of test takers’ ability, raters’ severity, task difficulty, group expertise, scale criterion category, and test version together, along with their two-way effects. Moreover, little research has investigated how long the effects of rater training last. Consequently, this study explored the influence of a training program and feedback by having 20 raters score the oral performances of 300 test takers in three phases. The results indicated that training can lead to higher interrater reliability and to reduced severity/leniency and bias. However, training does not bring raters into complete agreement; rather, it makes them more self-consistent. Even though rater training may result in higher internal consistency among raters, it cannot simply eradicate individual differences related to their characteristics. That is, experienced raters, owing to their idiosyncratic characteristics, did not benefit as much as inexperienced ones. This study also showed that the effects of training may not endure in the long term; ongoing training throughout the rating period is therefore required so that raters can regain consistency.
Publisher
Springer Science and Business Media LLC
Subject
Linguistics and Language, Language and Linguistics
Cited by
3 articles.