How tests are scored

David Kuntz is the Vice President of Research at Knewton, where he works on perfecting the algorithm for its GMAT prep course.

We’ve received grades all our lives. In fact, we’re so used to them that we often don’t think very much about what they mean, or how they are calculated. So today we’re going to look at some of the different ways in which tests are scored, and at what those scores mean.

In preschool, we receive grades in the form of category scores: gold stars, silver stars, or bronze stars. Sometimes we might get two gold stars, or even three gold stars. These kinds of grades divide the relevant universe of people into some small number of categories, usually low-medium-high.

Later on we start to receive simple tally scores: 8/10 or 23/25. Soon these are represented as percentages: 80% correct, or 92%. One of the funny things about grades is that by the time we’re in high school and college, they have reverted to category scores (A, B, C, D, F) through a transformation of the percentages.

Every teacher and school adopts slightly different transformations. In some places, a grade of A is reserved for 96% and above. In other places the cutoff is 92%. In still others, it might be 90%. So what an “A” means can vary widely from place to place.
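
To make the effect of different cutoffs concrete, here is a minimal Python sketch. Only the three A cutoffs (96, 92, and 90) come from the examples above; the function name and the school labels are made up for illustration.

```python
# A cutoffs from the three examples in the text; the school labels are hypothetical.
A_CUTOFFS = {"school_1": 96, "school_2": 92, "school_3": 90}


def earns_an_a(percent, cutoff):
    """Return True if a percentage score meets or exceeds the A cutoff."""
    return percent >= cutoff


# The same 93% is an A at two of the schools, but not at the strictest one.
results = {school: earns_an_a(93, cutoff) for school, cutoff in A_CUTOFFS.items()}
print(results)
```

The point of the sketch is simply that the letter depends on the transformation as much as on the percentage.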

Everyone knows that some test questions are more difficult than others. Occasionally, teachers will take this into account by awarding more points for the hard questions than for the easy ones.

The basic sequence for most kinds of scoring is this:

1. Count the number of questions you answered correctly, or add up the points associated with each of those questions.
2. Subtract, if applicable, any penalty for incorrect answers. This result is your “raw score.”
3. Apply some transformation to your raw score (e.g., divide by total possible points, or use some more complicated function) to arrive at your “scaled score.”
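
The three steps above can be sketched in a few lines of Python. Everything specific here is an assumption chosen for illustration: the five-question test, the 2-point weighting for the harder questions, the 0.25-point penalty per wrong answer, and the use of a simple percentage as the transformation.

```python
def raw_score(responses, penalty=0.0):
    """Sum points for correct answers, then subtract a penalty per wrong answer.

    `responses` is a list of (points, outcome) pairs, where outcome is
    "correct", "wrong", or "blank". Blank answers earn and cost nothing.
    """
    earned = sum(points for points, outcome in responses if outcome == "correct")
    wrong = sum(1 for _, outcome in responses if outcome == "wrong")
    return earned - penalty * wrong


def scaled_score(raw, total_points):
    """One simple transformation: raw score as a percentage of total points."""
    return 100.0 * raw / total_points


# Hypothetical test: three 1-point questions and two harder 2-point questions.
responses = [(1, "correct"), (1, "correct"), (1, "wrong"), (2, "correct"), (2, "blank")]
total = sum(points for points, _ in responses)  # 7 points possible

raw = raw_score(responses, penalty=0.25)  # 4 points earned, minus 0.25 for one wrong answer
print(raw, round(scaled_score(raw, total), 1))
```

A real test might use a lookup table or a curve in place of `scaled_score`; the percentage is just the simplest choice.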

For those of you taking the GMAT, the basic sequence is very different. Because the GMAT is an adaptive test, it looks at your performance on each question as you respond to it, and estimates your math or verbal ability along the way. Then it uses that ability estimate to calculate your score. For the GMAT, the basic sequence is:

1. Deliver a test question. Use your answer, along with a number of other factors including the difficulty of the question, to estimate your ability.
2. Based on the current estimate of your ability, select the next question so as to maximize the amount of information available for refining that estimate.
3. Repeat steps 1 and 2 until the test is complete.
4. Apply a transformation to the resulting estimate of your ability to determine your section score.
5. When you have completed all sections of the test, apply a transformation using all of the resulting ability estimates to determine your overall score.
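
For readers who like to see the loop spelled out, here is a rough Python sketch of the first four steps of the sequence above. It is not the GMAT's actual algorithm, which this post does not describe in detail. As stand-ins, it assumes a Rasch (one-parameter logistic) model for the chance of a correct answer, a grid-based posterior mean with a standard normal prior for the ability estimate, a hypothetical bank of item difficulties, a simulated test-taker, and a made-up linear scale for the section score.

```python
import math


def p_correct(theta, difficulty):
    """Rasch (one-parameter logistic) model: chance of a correct answer."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def item_information(theta, difficulty):
    """Fisher information of a Rasch item at ability theta: p * (1 - p)."""
    p = p_correct(theta, difficulty)
    return p * (1.0 - p)


def estimate_ability(history):
    """Posterior-mean ability on a grid from -4 to 4, with a standard normal prior.

    `history` is a list of (difficulty, was_correct) pairs.
    """
    grid = [g / 10.0 for g in range(-40, 41)]
    weights = []
    for theta in grid:
        w = math.exp(-0.5 * theta * theta)  # unnormalized normal prior
        for difficulty, was_correct in history:
            p = p_correct(theta, difficulty)
            w *= p if was_correct else 1.0 - p
        weights.append(w)
    return sum(t * w for t, w in zip(grid, weights)) / sum(weights)


def run_adaptive_section(item_bank, answer_fn, num_questions):
    """Pick the most informative unused item, deliver it, re-estimate, repeat."""
    remaining = list(item_bank)
    history = []
    theta = 0.0  # start from an average ability estimate
    for _ in range(num_questions):
        item = max(remaining, key=lambda b: item_information(theta, b))
        remaining.remove(item)
        history.append((item, answer_fn(item)))
        theta = estimate_ability(history)
    return theta, history


def to_section_score(theta):
    """A made-up linear transformation, not the GMAT's real scale."""
    return round(30 + 10 * theta)


# Hypothetical bank of item difficulties and a simulated test-taker who
# answers correctly whenever the item is no harder than their true ability.
bank = [d / 4.0 for d in range(-12, 13)]  # -3.0 to 3.0 in steps of 0.25
true_ability = 1.0
theta, history = run_adaptive_section(bank, lambda b: b <= true_ability, 10)
print(round(theta, 2), to_section_score(theta))
```

Under the Rasch model an item is most informative when its difficulty is close to the current ability estimate, which is why the sketch keeps serving questions near the test-taker's level. The final step, combining section estimates into an overall score, is left out because it would need yet another assumed transformation.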

What the GMAT does explicitly is what all tests try to do implicitly: ascertain what you know and are able to do, in some context or another. It’s a more responsive way of testing, and we use the same adaptive technology in our GMAT practice tests.

In a later post, we’ll talk about validity, which has to do with what your score really means within a context, and why anyone would care.

Until then, do your homework!

cesar anchante

One aspect of GMAT practice tests is that we get penalized for not answering questions. I was struggling with low scores (400-500) for a while until I realized that not answering some questions and finishing each section increased my scores by 100 points (to 500-600). The penalty for not finishing a section can make a great difference.
