Affiliation:
1. Defense Personnel Analytics Center, USA; Defense Language Institute Foreign Language Center, USA
Abstract
This study evaluated the efficacy of two proposed methods in an operational standard-setting study conducted for a high-stakes language proficiency test of the U.S. government. The goal was to identify low-cost modifications to the existing Yes/No Angoff method that would increase the validity and reliability of the recommended cut scores, using a convergent mixed-methods study design. The study used the Yes/No ratings as the baseline method in two rounds of ratings, while differentiating the two methods by incorporating item maps and an Ordered Item Booklet, integral tools of the Mapmark and Bookmark methods, respectively. The results showed that the internal validity evidence was similar across both methods, especially after Round 2 ratings. When procedural validity evidence was considered, however, a preference emerged for the method in which panelists conducted the initial ratings without access to empirical item difficulty information, which was then provided on an item map as part of the Round 1 feedback. The findings highlight the importance of evaluating both internal and procedural validity evidence when selecting standard-setting methods.