Many observers have criticized the difficulty of the questions that appear in the OG (the Official Guide). The problem is not so much that the OG items are too easy as that the OG, like a traditional pencil-and-paper test, features items that range across the whole p scale. The high-probability items (the easy questions) are generally of little interest to students who are seeking scaled scores at the far right tail. Suppose we group items into seven difficulty strata, with the middle stratum called stratum x, so that the strata run from x-3 to x+3. The majority of items that one would encounter on a traditional pencil-and-paper test would come from the middle strata near x (i.e., x-1, x, x+1). On a traditional test, you would encounter very few questions at the top of the range (x+3), since these questions have undesirable characteristics for most test takers: they provide low differentiation (discrimination) among low- and medium-ability test takers.
Now suppose we create an adaptive test from this body of traditional questions. The first test item administered to each student would come from the middle stratum (stratum x). If it is answered correctly, we move to stratum x+1 for the second question. If the first question is answered incorrectly, we move to stratum x-1 for the second question. If the first two questions are both answered correctly, we move to stratum x+2 for the third question. If this third question is answered incorrectly, we move back down to stratum x+1. At the end of the test, we would combine the information about the difficulty of the questions administered and the number answered correctly and incorrectly to obtain an estimate of the net number of items this student would have answered correctly on a traditional paper-based test (this is the raw score). The estimated raw score on the paper test is then used to read the associated scaled score from the paper test's raw-to-scaled score conversion table.
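The stratum-stepping rule described above can be sketched in a few lines of Python. This is only an illustration of the up-one, down-one movement, not an actual scoring engine: the function name `stratum_path`, the encoding of strata as offsets from x (0 for x, +1 for x+1, and so on), and the choice to hold at x-3 or x+3 when a step would leave the seven strata are all assumptions made for the example. The final raw-score estimate is left out because the text does not specify how it is computed.

```python
def stratum_path(responses, start=0, lowest=-3, highest=3):
    """Return the stratum (as an offset from x) of each administered item.

    responses: iterable of booleans, True for a correct answer.
    start:     offset of the first item's stratum (0 means stratum x).
    lowest, highest: assumed bounds of the seven strata, x-3 .. x+3.
    """
    path = []
    current = start
    for correct in responses:
        # Record the stratum the current item is drawn from.
        path.append(current)
        # Move up one stratum after a correct answer, down one after a miss.
        step = 1 if correct else -1
        # Assumption: stay at the boundary if the step would leave the range.
        current = max(lowest, min(highest, current + step))
    return path


# Two correct answers, then a miss, then a correct answer.
print(stratum_path([True, True, False, True]))  # [0, 1, 2, 1]
```

The printed path matches the walk in the text: the first item comes from x, the second from x+1, the third from x+2, and after the miss on the third item the fourth comes from x+1 again.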
Hjort