Affiliation:
1. National Defense University
2. Mugla Sitki Kocman University
3. Gazi University
Abstract
This study compares the Wald test and likelihood ratio test (LRT) approaches developed for cognitive diagnostic models (CDMs) with differential item functioning (DIF) detection methods based on Classical Test Theory (CTT) and Item Response Theory (IRT), using the TIMSS 2011 dataset in a retrofitting design. CDMs hold considerable potential for detecting DIF and for contributing to validity evidence, but they yield trustworthy results only when their strong methodological requirements are met. This study is therefore intended to contribute to the literature on the correct use of CDMs and to evaluate how well these newer approaches agree with traditional methods. According to the analysis results, thirty-one items were flagged differently by the cognitive diagnosis approaches and the traditional methods. The largest DIF was detected by Raju's unsigned area measure, an IRT-based technique, whereas the smallest DIF was detected by the Wald test developed for CDMs. Overall, the methods not based on CDMs flagged more items with DIF, while the CDM-based Wald test and LRT flagged fewer. The DIF analyses in this study address the test's psychometric properties within the CDM framework rather than the sources of bias; researchers can take the work a step further with more specific assessments of item bias in relation to test structure, test content, and subgroups. In addition, the DIF analyses were carried out using only the gender variable, so researchers can apply other grouping variables suited to their own purposes.
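As a point of reference for the traditional methods the abstract mentions, the sketch below implements the Mantel-Haenszel chi-square, a standard CTT-based DIF procedure: examinees are stratified by a matching score and a 2x2 group-by-response table is pooled across strata. This is a minimal illustrative sketch, not the authors' exact setup; the function name, the skipping rule for tiny strata, and the choice of matching variable are assumptions made here for clarity.

```python
import numpy as np
from scipy.stats import chi2

def mantel_haenszel_dif(item, total, group):
    """Mantel-Haenszel DIF chi-square for one dichotomous item.

    item  : 0/1 responses to the studied item
    total : matching score (e.g., rest score or total test score)
    group : 0 = reference group, 1 = focal group
    Returns the continuity-corrected MH chi-square and its p-value.
    """
    item, total, group = map(np.asarray, (item, total, group))
    num = 0.0  # sum over strata of (a_k - E[a_k])
    var = 0.0  # sum over strata of Var(a_k)
    for k in np.unique(total):
        s = total == k
        a = np.sum((group[s] == 0) & (item[s] == 1))  # reference, correct
        b = np.sum((group[s] == 0) & (item[s] == 0))  # reference, incorrect
        c = np.sum((group[s] == 1) & (item[s] == 1))  # focal, correct
        d = np.sum((group[s] == 1) & (item[s] == 0))  # focal, incorrect
        n = a + b + c + d
        if n < 2:
            continue  # stratum too small to contribute (assumed skip rule)
        num += a - (a + b) * (a + c) / n                       # observed - expected
        var += (a + b) * (c + d) * (a + c) * (b + d) / (n**2 * (n - 1))
    if var == 0:
        return np.nan, np.nan
    stat = (abs(num) - 0.5) ** 2 / var  # continuity-corrected chi-square, df = 1
    return stat, chi2.sf(stat, df=1)
```

For example, `mantel_haenszel_dif(resp[:, j], resp.sum(axis=1) - resp[:, j], gender)` would test item `j` against the rest score for a male/female grouping, flagging the item when the p-value falls below a chosen alpha. The CDM-based Wald and LRT procedures compared in the study operate differently, testing equality of item parameters across groups under a fitted cognitive diagnosis model.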
Publisher
Egitimde ve Psikolojide Olcme ve Degerlendirme Dergisi
Subject
Developmental and Educational Psychology, Education
Cited by
1 article.