Affiliation:
1. Key Laboratory of Brain, Cognition and Education Sciences (South China Normal University), Ministry of Education, Guangzhou, China
2. School of Psychology, Center for Studies of Psychological Application, Guangdong Key Laboratory of Mental Health and Cognitive Science, South China Normal University, Guangzhou, China
Abstract
This simulation study investigates how the separation of grade distributions and the ratio of common items affect the precision of vertical scaling, using a common-item design with the first grade as the base grade. Four grades of 1,000 students each take a 100-item test. Response matrices are simulated by the Monte Carlo method with a self-written program in R 3.0. Because the items are scored dichotomously (0/1), we adopt the two-parameter logistic model of Item Response Theory and use BILOG-MG for concurrent calibration with the EAP method. Bias and RMSE serve as precision indicators. The results show that: (1) The estimation precision of item and ability parameters differs across grades. For discrimination and difficulty parameters, precision is higher the closer a grade is to the base grade and decreases as the effect size increases. For ability parameters, precision is generally high except in the fourth grade, where it is much lower. Overall, precision is best at an effect size of 0.5. (2) The ratio of common items to total test length interacts with effect size. When the effect size is 0.5 or 1.0, estimation is most precise in every grade at a common-item ratio of 30%. When the effect size is 1.5, difficulty parameters are best estimated at a 30% common-item ratio for the first, second, and third grades but at 15% for the fourth grade, while ability parameters in all grades are best estimated at a 15% ratio. Thus, within the 15% to 30% range of common-item ratios, there is a trade-off between the estimation precision of ability parameters and that of item parameters. (3) The choice of base grade affects the accuracy of vertical scaling. When a lower grade is selected as the base grade, a large deviation arises if more than two consecutive cumulative conversions are needed to place upper-grade test scores on the lower-grade scale.
Therefore, when scores from a higher grade are converted to the scale of a lower grade, we suggest that the two grades be no more than two grades apart. Overall, a 30% proportion of anchor items is recommended for vertical scaling, but it is better to treat the proportion as variable (15%–30%) when the separation of grade distributions is taken into account.
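The simulation design described above can be illustrated with a minimal Python sketch. It is not the authors' R program: it only generates 0/1 responses under a two-parameter logistic model for four grades whose ability means are separated by a chosen effect size, and it defines bias and RMSE in their usual textbook form. The item-parameter distributions, the random seed, the unit ability variance, and all function names are assumptions made for illustration, and the calibration step (BILOG-MG in the study) is not reproduced.

```python
import math
import random


def p_correct(theta, a, b):
    """2PL probability of a correct response for ability theta,
    discrimination a, and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def simulate_responses(thetas, items, rng):
    """Return a 0/1 response matrix (rows: examinees, columns: items)."""
    return [
        [1 if rng.random() < p_correct(theta, a, b) else 0 for (a, b) in items]
        for theta in thetas
    ]


def bias(estimates, truths):
    """Mean signed error of the estimates."""
    return sum(e - t for e, t in zip(estimates, truths)) / len(truths)


def rmse(estimates, truths):
    """Root mean squared error of the estimates."""
    return math.sqrt(
        sum((e - t) ** 2 for e, t in zip(estimates, truths)) / len(truths)
    )


rng = random.Random(2024)   # assumed seed, for reproducibility only
effect_size = 0.5           # separation between adjacent grade means, in SD units
n_per_grade = 1000          # from the abstract
n_items = 100               # from the abstract

# Grade 1 is the base grade (mean 0); each later grade is shifted by effect_size.
# Unit variance within each grade is an assumption.
grade_thetas = {
    grade: [rng.gauss(effect_size * (grade - 1), 1.0) for _ in range(n_per_grade)]
    for grade in range(1, 5)
}

# Assumed generating distributions for item parameters (not from the study).
items = [(rng.uniform(0.8, 2.0), rng.gauss(0.0, 1.0)) for _ in range(n_items)]

# Response matrix for the base grade; other grades are generated the same way.
responses = simulate_responses(grade_thetas[1], items, rng)
```

In a full replication, the estimated item and ability parameters returned by the calibration software would be passed to `bias` and `rmse` together with the generating values.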
Funder
Natural Science Foundation of Guangdong Province
Ministry of Education Foundation of the People’s Republic of China