Hey AK125,
All good questions, and with what you wrote I was able to dig in and take a look at the back end of your test. From what I can see, everything looks good (the error margin is within the tolerance we like, and the trend in "theta values" all makes logical sense given how you responded). A couple of things for you:
-Adaptive scoring is all about probabilities. The system gauges your ability by looking at your responses and calculating the probability that someone with those responses would be at the 99th percentile, the 95th, the 90th, etc., and its "ability estimate" of you is whichever ability level carries the highest probability at that point. It delivers questions the same way: it scans the pool of available questions for ones with a high probability of providing valuable information about you. So it's very common for the system to deliver a question a bit below its current estimate of your ability, simply because that problem has a high probability of helping the system learn more about you in that range (say the system currently thinks you're in the 610-670 range: missing that "easier" problem may help it realize that you're highly unlikely to be above 660, while getting it right might help cement your floor at 620). Because of that, you can't reason "a 550-level question must mean the system thinks I'm below 600." That question may just have a high probability of helping the system learn more about your ability near - but not exactly at - the "difficulty level" of that problem.
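To make "which ability level carries the highest probability" concrete, here's a toy sketch in Python. This is NOT the GMAT's actual algorithm (the real model and item parameters are proprietary); it just uses the standard two-parameter logistic (2PL) model from Item Response Theory, with made-up difficulties and a made-up response pattern, to show how a likelihood gets computed and maximized:

```python
import math

def p_correct(theta, b, a=1.0):
    """2PL IRT model: probability that an examinee at ability theta
    answers an item of difficulty b (discrimination a) correctly."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def likelihood(theta, responses):
    """Probability of seeing this exact response pattern
    [(item_difficulty, answered_correctly), ...] at ability theta."""
    L = 1.0
    for b, correct in responses:
        p = p_correct(theta, b)
        L *= p if correct else (1.0 - p)
    return L

# Hypothetical response pattern (difficulties in theta units, invented for illustration)
responses = [(-0.5, True), (0.0, True), (0.5, False), (0.3, True)]

# Grid-search ability levels from -3.0 to +3.0; the "ability estimate"
# is simply the level where the observed responses are most probable.
grid = [t / 10.0 for t in range(-30, 31)]
best = max(grid, key=lambda t: likelihood(t, responses))
```

Here `best` lands a bit above the difficulties of the items answered correctly, which is exactly the "highest-probability ability level" idea above.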
-Which brings up another nuanced point about Item Response Theory: the psychometricians behind IRT don't use the term "difficulty level" for questions...that's a test-taker and tutor way of thinking about the problems. They look at the "b-value," which is the ability level at which the question provides the most information about examinees. It's similar to difficulty but not quite the same thing, and what's important is that wherever the b-value may lie (say, at the 60th percentile), the problem still has a lot of predictive value for the ability levels surrounding it. So, again, if the system serves you a 600-level problem it's not necessarily because it doubts you can handle a 650...it's just that the system believes it will get more information from that problem than from one at the 650 level, even if it thinks you're closer to 650 than to 600 at that moment.
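The "provides information near its b-value, not just at it" point can be made precise with the standard Fisher information function for a 2PL item. Again a toy sketch with invented numbers, not the actual test's internals - it just shows that an item's information peaks when the examinee's ability equals the item's b-value, but falls off slowly nearby:

```python
import math

def p_correct(theta, b, a=1.0):
    """2PL probability of a correct answer at ability theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, b, a=1.0):
    """Fisher information a 2PL item yields about an examinee at theta:
    a^2 * p * (1 - p), which peaks at theta == b."""
    p = p_correct(theta, b, a)
    return a * a * p * (1.0 - p)

peak = item_information(0.0, b=0.0)    # examinee exactly at the item's b-value
nearby = item_information(0.5, b=0.0)  # same item, examinee half a unit higher
```

Running this, `nearby` is still more than 90% of `peak` - which is why a "600-level" item can be the most informative choice even for someone the system suspects is closer to 650.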
Even as I read that back it may not sound all that convincing, but consider an example from professional sports. The best team in the English Premier League or the NBA never goes undefeated. Even though a great team may never have less than a 60% chance of winning any given game (after all, it's better than every other team), you can learn a lot about that team by seeing how it performs over a 10-game stretch when its likelihood of winning any one game is 70%. (Think about that probability...a 70% chance of winning one game means only a 49% chance of winning two in a row, and less than a 25% chance of winning four in a row.) Question delivery is similar - the system can learn a lot about you from seeing how you handle problems below your ability level, as well as from problems above it.
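The streak arithmetic in that parenthetical is just independent probabilities multiplied together:

```python
# A 70%-favorite's chance of winning streaks of consecutive, independent games
p_win = 0.70
two_in_a_row = p_win ** 2    # 0.7 * 0.7 = 0.49
four_in_a_row = p_win ** 4   # about 0.24, i.e. less than a 25% chance
```

So even a clear favorite drops games regularly - and each result still tells you something about how good the team really is.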
-And I think that builds to this really important part: in these forums and in classrooms and in textbooks and blog posts, we try to personify the scoring algorithm to make it make sense. But it's just a big data computer. It's not "thinking" about your ability ("hey, AK125 got this 600-level problem right...I wonder if he's ready for a 650..."). It's just assessing the data and assigning questions - whether at, above, or below its estimate of your ability - based on how much more information it can get about you with the next question. Which can sometimes feel a little underwhelming or disappointing, again because we tend to personify the test and feel like if we got 2-3 questions right in a row we've "earned" a "harder" question. But the system doesn't work that way - it isn't concerned with appearances; it just mathematically goes about its job.
In your case, there was a stretch in the 20s where you got a string of questions with a slightly lower b-value than your ability estimate - which, judging by question delivery alone, makes it look like you were doing worse than you were - but then by the 30s you got more questions above your level. All in all, though, the overall trend matches where you ended up scoring, and the error margins were right where we'd want them, so you should feel pretty confident that you scored where you were supposed to.
*THAT* said...remember it's all probabilities: a Q46 means that, of all the available scores, it's most likely you're a 46, less likely but still reasonable that you're a 45 or 47, and even less likely but not out of the realm of possibility that you're a 44 or 48. Keep that in mind with any practice test score.
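That "most likely a 46, plausibly a 45 or 47" shape can be pictured as a probability distribution over scores. Purely illustrative numbers here - the sigma value and the normal shape are my assumptions for the sketch, not the test's published error model:

```python
import math

def score_probabilities(estimate, sigma, scores):
    """Toy discretized bell curve over reportable scores, centered on the
    ability estimate, with sigma standing in for the error margin."""
    weights = [math.exp(-((s - estimate) ** 2) / (2 * sigma ** 2)) for s in scores]
    total = sum(weights)
    return {s: w / total for s, w in zip(scores, weights)}

# Hypothetical Q46 estimate with a one-point error margin
probs = score_probabilities(46, sigma=1.0, scores=range(43, 50))
```

The distribution peaks at 46, with 45 and 47 close behind and 44 and 48 trailing off - which is exactly how to read any single practice test score.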