Affiliation:
1. The Danish Institute for Educational Research, Copenhagen, Denmark
2. Department of General and Applied Linguistics, University of Copenhagen, Copenhagen, Denmark
Abstract
In experimental psychology, the degree of difference between the proportions of correctly solved items on two related tests (such as word lists) has been calculated by different methods: for example, as a simple difference (as used in within-subjects ANOVAs), a difference relative to potential gain, a quotient, a difference between standardized z-scores, or Signal Detection Theory's d’. Each of these may yield a different result. The present article discusses the choice of method with an example from reading research concerned with contextual facilitation in readers with different abilities. Assuming that the total number of correctly solved items captures all relevant variance in subjects’ abilities (i.e. it is a sufficient measure), it is demonstrated that the logarithm of the quotient between the odds for the frequencies of correct responses (log odds) is the most suitable method of calculation. For example, calculations based on log odds provide an appropriate ranking of the subjects, relevant for repeated-measures ANOVAs.
The aims of the present paper are to draw attention to the problem of comparing differences, to evaluate current methods of calculation, and to present a consistent solution to the problem. We illustrate the problem by applying several current methods of expressing differences in proportion correct to the same set of data: they yield radically different results. The choice of an appropriate method depends on the assumptions made about the underlying metric. We argue for a solution, the log odds measure, on the basis of Rasch's (1968a, 1968b) measurement model. This relies on a demonstration that, for a set of test items administered to a set of subjects, the proportion correct is, in technical terms, a sufficient statistic; that is, it captures all relevant variation in a subject's ability (or an item's difficulty).
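As an informal illustration of how the methods named in the abstract can disagree, the following Python sketch applies four of them to two hypothetical readers. The proportions, the reader labels, and all function names are assumptions made for this illustration and are not data or notation from the article; the z-score and d’ variants are left out because they need further assumptions about the sample.

```python
import math

# Each function compares a proportion correct in condition A (e.g. words in
# context) with a proportion correct in condition B (e.g. words in isolation).

def simple_difference(p_a, p_b):
    """Raw difference between the two proportions."""
    return p_a - p_b

def gain_relative_to_potential(p_a, p_b):
    """Difference divided by the room left for improvement above p_b."""
    return (p_a - p_b) / (1.0 - p_b)

def quotient(p_a, p_b):
    """Ratio of the two proportions."""
    return p_a / p_b

def log_odds_difference(p_a, p_b):
    """Natural log of the quotient between the odds of a correct response."""
    odds_a = p_a / (1.0 - p_a)
    odds_b = p_b / (1.0 - p_b)
    return math.log(odds_a / odds_b)

METHODS = {
    "simple difference": simple_difference,
    "relative to potential gain": gain_relative_to_potential,
    "quotient": quotient,
    "log odds": log_odds_difference,
}

# Hypothetical proportions (condition A, condition B), invented for illustration.
READERS = {
    "weaker reader": (0.75, 0.50),
    "stronger reader": (0.95, 0.80),
}

def compare(readers):
    """Return {method name: {reader name: value}} for every method and reader."""
    return {
        name: {reader: fn(p_a, p_b) for reader, (p_a, p_b) in readers.items()}
        for name, fn in METHODS.items()
    }

results = compare(READERS)
for method, values in results.items():
    larger = max(values, key=values.get)
    print(f"{method}: larger effect for the {larger}")
```

With these invented numbers, the simple difference and the quotient attribute the larger effect to the weaker reader, whereas the difference relative to potential gain and the log odds attribute it to the stronger reader, so the ranking of the two readers depends on the method chosen.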
The example for the presentation of the problem is selected from recent reading research using comparison of accuracy in two related reading tasks. Reading accuracy with two tasks is a suitable example because it is a simple, much-used design. Researchers
Subject
General Psychology, Experimental and Cognitive Psychology
Cited by
14 articles.