Imagine that, before the test existed, there were 100 patients, of whom 80 had appendicitis and 20 didn’t. All 100 were given surgery, although only 80 of those surgeries were necessary and 20 were not.
With the test that is 98% accurate, there are 2 extreme possibilities:
Case 1:
80 people have appendicitis; the test says all 80 have it.
20 people don’t have appendicitis; the test says 18 don’t have it and 2 have it. In this case 18 of the 20 unnecessary surgeries will be avoided, and the doctors will still perform 82 surgeries (more than the necessary number, 80). So this satisfies all the conditions. This case arises when the test's errors are people who don’t have appendicitis being diagnosed as having it. Option B says the same.
Case 2:
20 people don’t have appendicitis; the test says all 20 don’t have it.
80 people have appendicitis; the test says 78 have it and 2 don’t have it. In this case 2 necessary surgeries will also be avoided. The doctors will perform 78 surgeries (fewer than the necessary number, 80). So although every unnecessary surgery is avoided, this case fails the condition that no fewer necessary surgeries are performed than before.
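To double-check the arithmetic in the two cases, here is a minimal Python sketch. It assumes the same made-up numbers as above (80 sick patients, 20 healthy, 2 test errors out of 100) and assumes doctors operate only on patients the test flags as positive. The function name `surgeries_with_test` and its parameters are just illustrative.

```python
def surgeries_with_test(sick, healthy, false_negatives, false_positives):
    """Count surgeries when doctors operate only on positive test results.

    sick / healthy: patients who do / do not have appendicitis (assumed numbers).
    false_negatives: sick patients the test wrongly clears.
    false_positives: healthy patients the test wrongly flags.
    """
    necessary = sick - false_negatives   # sick patients who still get surgery
    unnecessary = false_positives        # healthy patients operated on anyway
    return {
        "total": necessary + unnecessary,
        "necessary": necessary,
        "unnecessary": unnecessary,
        "unnecessary_avoided": healthy - false_positives,
    }


SICK, HEALTHY = 80, 20  # hypothetical patient mix from the example above

# Case 1: both errors are false positives.
case1 = surgeries_with_test(SICK, HEALTHY, false_negatives=0, false_positives=2)
# Case 2: both errors are false negatives.
case2 = surgeries_with_test(SICK, HEALTHY, false_negatives=2, false_positives=0)

for label, result in (("Case 1", case1), ("Case 2", case2)):
    print(label, result)
```

Only Case 1 keeps the number of necessary surgeries at 80; in Case 2 it drops to 78, which is exactly what the argument says should not happen.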
In a test, a false positive refers to a diagnosis that mistakenly indicates that a condition, disease or infection is present. A false negative refers to a diagnosis that mistakenly indicates that a disease, infection or condition is absent. A false positive result from a doping test could ruin the career of an honest cyclist. A false negative result on a paternity test could prevent a father and son from reuniting.
Clearly, using this test, doctors can largely avoid unnecessary removals of the appendix (eliminate false positives) without, however, performing any fewer necessary ones than before (i.e. without producing more false negatives), since .....
It seems clear that before this test was developed, doctors removed the appendix of everybody who either had appendicitis or seemed to have it (false positives).
This test has an accuracy rate of 98%, but in order for the conclusion to be true, these few mistakes must involve cases in which people without appendicitis are deemed to have it (false positives), not the other way around. In other words, these mistakes cannot involve genuine cases of appendicitis that are classified as having nothing to do with appendicitis (false negatives), or else doctors would be performing fewer necessary operations (i.e. operations on appendicitis patients) than before.
The part in bold basically means that, with the test, they'll still perform the same number of necessary operations (as they used to). In other words, they'll catch people who have appendicitis just as much as they used to.
To complete this argument we need to find some evidence that supports the conclusion (notice the keyword "since").
How can we support the conclusion that they'll catch just as many people who have appendicitis as they used to?
Well, if the 2% error rate is exclusively due to the test saying you have appendicitis when you don't (rather than not catching your appendicitis), then the author's argument is supported (since the error rate without the test is 20%)...that's essentially what choice B says.
If the 2% error rate were due to the test not catching your appendicitis, then the author's conclusion that the test avoids unnecessary operations without reducing necessary ones is clearly weakened: the test would be decreasing the number of NECESSARY operations--clearly a bad outcome. Because the denial of choice B hurts the argument, choice B must be evidence that supports the argument.