Abstract
AbstractOne major aim of international large-scale assessments (ILSA) like PISA is to monitor changes in student performance over time. To accomplish this task, a set of common items (i.e., link items) is repeatedly administered in each assessment. Linking methods based on item response theory (IRT) models are used to align the results from the different assessments on a common scale. This work employs the one-parameter logistic (1PL) and the two-parameter logistic (2PL) IRT models as scaling models for dichotomous item response data. The present article discusses different types of trend estimates in country means and standard deviations for countries in ILSA. These types differ in three aspects. First, the trend can be assessed by an indirect or direct linking approach for linking a country’s performance at an international metric. Second, the linking for the trend estimation can rely on either all items or only the link items. Third, item parameters can be assumed to be invariant or noninvariant across countries. It is shown that the most often employed trend estimation methods of original trends and marginal trends can be conceived as particular cases of indirect and direct linking approaches, respectively. Through a simulation study and analytical derivations, it is demonstrated that trend estimates using a direct linking approach and those that rely on only link items outperformed alternatives for the 1PL model with uniform country differential item functioning (DIF) and the 2PL model with uniform and nonuniform country DIF. We also illustrated the performance of the different scaling models for assessing the PISA trend from PISA 2006 to PISA 2009 in the cognitive domains of reading, mathematics, and science. In this empirical application, linking errors based on jackknifing testlets were utilized that adequately quantify DIF effects in the uncertainty of trend estimates.
Publisher
Springer Science and Business Media LLC
Reference76 articles.
1. Andersson, B. (2018). Asymptotic variance of linking coefficient estimators for polytomous IRT models. Applied Psychological Measurement, 42(3), 192–205. https://doi.org/10.1177/0146621617721249
2. Battauz, M. (2020). Regularized estimation of the four-parameter logistic model. Psych, 2(4), 269–278. https://doi.org/10.3390/psych2040020
3. Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee’s ability. In F. M. Lord & M. R. Novick (Eds.), Statistical theories of mental test scores (pp. 397–479). MIT Press.
4. Brennan, R. L. (2001). Generalizability theory. Springer. https://doi.org/10.1007/978-1-4757-3456-0
5. Cai, L., & Moustaki, I. (2018). Estimation methods in latent variable models for categorical outcome variables. In P. Irwing, T. Booth, & D. J. Hughes (Eds.), The Wiley handbook of psychometric testing: a multidisciplinary reference on survey, scale and test (pp. 253–277). New York: Wiley. https://doi.org/10.1002/9781118489772.ch9
Cited by
1 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献