Evaluating subscore uses across multiple levels: A case of reading and listening subscores for young EFL learners

Published:2019-10-14 Issue:2 Volume:37 Page:254-279
ISSN:0265-5322
Container-title:Language Testing
language:en

Author:

Choi Ikkyu¹, Papageorgiou Spiros¹

Affiliation:

1. Educational Testing Service, USA

Abstract

Stakeholders of language tests are often interested in subscores. However, reporting a subscore is not always justified; a subscore should provide reliable and distinct information to be worth reporting. When a subscore is used for decisions across multiple levels (e.g., individual test takers and schools), it needs to be justified for its reliability and distinctiveness at every relevant level. In this study, we examined whether reporting seven Reading and Listening subscores of the TOEFL Primary® test, a standardized English proficiency test for young English as a foreign language learners, could be justified at the individual and school levels. We analyzed data collected in pilot administrations, in which 4776 students from 51 schools participated. We employed the classical test theory (CTT) based approaches of Haberman (2008) and Haberman, Sinharay, and Puhan (2009) for the individual and school-level investigations, respectively. We also supplemented the CTT-based approaches with a factor analytic approach for the individual-level analysis and a multilevel modeling approach for the school-level analysis. The results differed across the two levels: we found little support for reporting the subscores at the individual level, but strong evidence supporting the added value of the school-level subscores when the sample size for each school exceeds 50.

Publisher

SAGE Publications

Subject

Linguistics and Language, Social Sciences (miscellaneous), Language and Linguistics

Link

http://journals.sagepub.com/doi/pdf/10.1177/0265532219879654

References: 16 articles.

1. The Challenge of (Diagnostic) Testing:

2. Statistical Analyses for Language Assessment

Cited by 8 articles.

1. Developing and using a scalable assessment to measure preservice elementary teachers' content knowledge for teaching about matter;Journal of Research in Science Teaching;2023-09-12

2. A Review of Factors Affecting the Acquisition of Second Language Reading Skills;Linguistics and Literature Review;2023-03-31

3. Evaluating the Use and Interpretation of the TOEIC® Listening and Reading Test Score Report: Perspectives of Test Takers in Japan;ETS Research Report Series;2023-01-23

4. Application of Bi-factor MIRT and Higher-order CDM Models to an In-house EFL Listening Test for Diagnostic Purposes;Language Assessment Quarterly;2021-11-30

5. Reading is a multidimensional construct at child-L2-English-literacy onset, but comprises fewer dimensions over time: Evidence from multidimensional IRT analysis;Language Testing;2021-10-07