Affiliation:
1. The University of Texas at Austin
2. Fordham University
Abstract
In differential item functioning (DIF) detection, the null hypothesis test checks for a subgroup difference in item-level performance: if the null hypothesis of “no DIF” is rejected, the item is flagged for DIF. Conversely, an item is kept in the test form if there is insufficient evidence of DIF. We present frequentist and empirical Bayes approaches for implementing statistical equivalence testing for DIF using the Mantel–Haenszel (MH) DIF statistic. With these approaches, rejection of the null hypothesis of “DIF” allows the conclusion of statistical equivalence, a more stringent criterion for keeping items. In other words, the roles of the null and alternative hypotheses are interchanged in order to have positive evidence that the DIF of an item is small. A simulation study compares the equivalence testing approaches to the traditional MH DIF detection method with the Educational Testing Service classification system. We illustrate the methods with item response data from the 2012 Programme for International Student Assessment.
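As a rough illustration of the reversed hypotheses described in the abstract, the sketch below computes the MH D-DIF statistic from stratified 2x2 tables and applies a generic two one-sided tests (TOST) procedure. It is not the authors' implementation: the function names, the equivalence margin `delta = 1.0`, the normal approximation, and the use of the Robins-Breslow-Greenland variance for the log odds ratio are all assumptions made for illustration, and the empirical Bayes variant is not shown.

```python
import math
from statistics import NormalDist


def mh_d_dif(tables):
    """Return (MH D-DIF, standard error) from per-stratum 2x2 counts.

    Each table is (A, B, C, D): reference correct, reference incorrect,
    focal correct, focal incorrect, for one matched total-score stratum.
    The standard error uses the Robins-Breslow-Greenland variance of
    ln(alpha_MH); this choice is an assumption of the sketch.
    """
    R = S = PR = PS_QR = QS = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n == 0:
            continue  # skip empty strata
        r, s = a * d / n, b * c / n
        p, q = (a + d) / n, (b + c) / n
        R += r
        S += s
        PR += p * r
        PS_QR += p * s + q * r
        QS += q * s
    alpha_mh = R / S  # MH common odds ratio
    var_log = PR / (2 * R * R) + PS_QR / (2 * R * S) + QS / (2 * S * S)
    d_dif = -2.35 * math.log(alpha_mh)  # ETS delta-scale transformation
    return d_dif, 2.35 * math.sqrt(var_log)


def tost_equivalence(d_dif, se, delta=1.0, alpha=0.05):
    """Generic TOST: H0 is |DIF| >= delta, H1 is |DIF| < delta.

    Returns True when both one-sided tests reject, i.e. there is
    positive evidence that DIF is smaller than delta in magnitude.
    The margin delta=1.0 is a hypothetical default, not a recommendation.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha)
    z_lower = (d_dif + delta) / se  # tests DIF <= -delta
    z_upper = (d_dif - delta) / se  # tests DIF >= +delta
    return z_lower > z_crit and z_upper < -z_crit


# Two strata with identical odds in both groups (made-up counts).
example_tables = [(400, 100, 400, 100), (300, 200, 300, 200)]
example_d, example_se = mh_d_dif(example_tables)
example_equivalent = tost_equivalence(example_d, example_se)
```

Under this framing, an item with a small estimated DIF but a large standard error is not declared equivalent, which is the sense in which the criterion for keeping items is more stringent than failing to reject "no DIF".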
Publisher
American Educational Research Association (AERA)
Subject
Social Sciences (miscellaneous), Education
Cited by
3 articles.