Sampling in Official GMAT CR Questions

Question

https://gmatclub.com/blog/wp-content/uploads/2013/08/Picture5-e1377882587945.png

INTRODUCTION

The objective of this article is to introduce a concept from statistics that has been tested a number of times in the GMAT Critical Reasoning section, and to use this concept to solve official CR arguments. Even though the concept is fairly simple to understand and apply, students who are unaware of it often fail to select the correct option in such questions and pay the price in the form of an incorrect response.

CR ARGUMENTS BASED ON RESEARCH FINDINGS

The concept that we will discuss in this article is the representative sample. The understanding of this concept is tested in GMAT CR arguments that draw conclusions based on the findings of research studies. An example of a research study could be a study which seeks to estimate the average score that GMAT test takers expect just before they enter the exam hall, and their actual average score. Basically, a research study collects data from the real world and draws conclusions on the basis of its analysis of the data.

In the context of GMAT, we need to understand one important characteristic of all research studies: a research study deals with a sample but makes estimates about the entire population from which the sample is taken. To understand this characteristic, let’s first understand the terms sample and population.

SAMPLE AND POPULATION

Let’s understand these terms with the example of the research we talked about: a study which seeks to estimate the average score that GMAT test takers expect just before they enter the exam hall, and their actual average score. In this research, the population is the set of all GMAT test takers. This is so because the study is interested in finding information about the set of all GMAT test takers.

Purpose = find the average GMAT score of the population of GMAT test takers.
Population: The population is the entire group of entities (people, things, animals etc.) we are interested in. It is the entire group we wish to understand or draw conclusions about.

However, collecting the required data for all the GMAT test takers may not be feasible for a research study because of the time and money required to do so. Therefore, a research study will generally select a small proportion of the GMAT test takers and collect data on these test takers. The set of test takers selected for the research is called the ‘sample’.

Sample: A sample is a group of units selected from a larger group (the population); in other words, a sample is a subset of the population. A sample is generally selected for study because the population is too large to study in its entirety.

In the given research study, the sample could consist of
• 100 randomly selected people from fifteen different countries, or
• all the GMAT test takers who take the GMAT at a particular test center over a one-month period, or
• all the GMAT test takers who are registered at GMATClub

So, a sample could be taken in several ways and, as we can understand, the way a sample is taken will have an impact on the data we collect. After the data is collected, the research will use it (average expected score and actual average score) to make estimates about the population, i.e. all GMAT test takers.

MORE EXAMPLES OF SAMPLE AND POPULATION

https://gmatclub.com/blog/wp-content/uploads/2013/08/population.jpg

EXAMPLE 1

Let’s consider a research study: a study wants to find the average income of the population of a city consisting of different income groups by measuring the average income of a section of the population.

What is the population in this case? The population is the entire population of the city.

What is the sample? The sample is the section of people whose average income will be measured in the study.

EXAMPLE 2

Let’s consider another example of a research study: a study wants to estimate the average salinity of the Red Sea by measuring the salinity of one million liters of water from the Red Sea.

What is the population in this case? The population is the entire sea water of the Red Sea.

What is the sample? The sample is the one million liters of water that will be taken from the Red Sea.

REPRESENTATIVE SAMPLE

As we have learnt so far, research studies use sample data to make estimates about the population data. Now, for sample data to give correct estimates of the population data, the chosen sample should be representative of the actual population; in other words, the chosen sample should give a true picture of the population.

A REPRESENTATIVE SAMPLE IS A SAMPLE WHICH GIVES A TRUE PICTURE OF THE POPULATION OR ONE WHOSE CHARACTERISTICS ARE IN LINE WITH THE CHARACTERISTICS OF THE POPULATION.

EXAMPLE 1

https://gmatclub.com/blog/wp-content/uploads/2013/08/sample.jpg

If we want to find the average income of a population consisting of different income groups, then the sample must have people from all income groups in the same proportion as they constitute the population. So, if there are five sections in the population, each representing 20% of the population, then the sample must also have each of these sections constituting 20% of the sample. If the sample over-represents one group relative to the others, then the average income of the sample can be quite different from the average income of the population. For a sample to give a correct estimate of the population, the sample’s characteristics have to be in line with the population’s characteristics; in other words, the sample has to be representative of the actual population.
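To make the income example concrete, here is a minimal Python sketch. The group names, head counts, and income figures are all made-up assumptions chosen only for illustration; the only thing taken from the text is the idea of five groups that each make up 20% of the population.

```python
# Hypothetical figures: five income groups, each 20% of a city of 1000 people.
GROUP_INCOME = {"A": 20_000, "B": 40_000, "C": 60_000, "D": 80_000, "E": 100_000}

def average_income(counts):
    """Average income of a set of people, given head counts per income group."""
    total_people = sum(counts.values())
    total_income = sum(counts[g] * GROUP_INCOME[g] for g in counts)
    return total_income / total_people

# The whole population: 200 people in each group (20% each).
population = {"A": 200, "B": 200, "C": 200, "D": 200, "E": 200}

# A representative sample of 100 people keeps the same 20% proportions.
representative_sample = {"A": 20, "B": 20, "C": 20, "D": 20, "E": 20}

# A biased sample of 100 people that over-represents the richest group.
biased_sample = {"A": 5, "B": 10, "C": 10, "D": 15, "E": 60}

print(average_income(population))             # 60000.0
print(average_income(representative_sample))  # 60000.0
print(average_income(biased_sample))          # 83000.0
```

With these assumed numbers, the representative sample reproduces the population average exactly, while the biased sample overstates it by a wide margin.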
EXAMPLE 2

If we seek to find the density of the sea water and we collect a sample from an area where the water density is abnormally high, would the sample data collected in this case give us the right estimates about the population data (i.e. sea water density)? The answer is No. For a sample to give a correct estimate of the population, the sample’s characteristics have to be in line with the population’s characteristics; in other words, the sample has to be representative of the actual population.

In our sea water example, one way to make sure that the sample water is representative is to collect water from different areas of the sea and mix it together to form a sample. In this way, we can make sure that we do not end up selecting water which has abnormally higher or lower density than the sea water as a whole.

EXAMPLE 3

Similarly, in the GMAT test takers research, if we somehow choose a sample of only high scorers, then the sample data collected will not give us correct estimates of the population data. In such a case, the average actual score and, most probably, the average expected score too would be higher than the corresponding figures for the population of all GMAT test takers.

BUILDING STRENGTHENERS, WEAKENERS AND ASSUMPTIONS

WEAKENER

Now, suppose we have an argument which draws a conclusion based on the findings of a research study. In such a case, what would happen if we say that the sample used by the research study was not representative of the actual population? The answer is that our trust in the conclusion will be significantly weakened, since the conclusion depended on the validity of the research findings. Therefore, a statement suggesting that the sample used in conducting the research was unrepresentative would be a valid weakener for the argument.

STRENGTHENER

Now, suppose we say just the opposite: the sample used by the research is actually representative of the population. Would our belief in the conclusion be strengthened? The answer is Yes.
Now, with this additional information, we are surer of the research findings, and hence our belief in the conclusion, which depends on the findings, has increased.

ASSUMPTION

Based on the weakener we have discovered, can you think of an assumption made in an argument which draws a conclusion based on research findings? The assumption is that the sample used in the research was representative of the actual population. This assumption is required because, if it is not true, the research finding will not be believable and the conclusion will break down.

Now, let’s look at two OG questions which use this understanding of representative sampling: one weaken question and one assumption question.

https://gmatclub.com/blog/wp-content/uploads/2013/07/gmat2-banner11-e1373945998281.jpg

OG QUESTION – WEAKEN

Solve this question yourself before reading the analysis:

A study of high blood pressure treatments found that certain meditation techniques and the most commonly prescribed drugs are equally effective if the selected treatment is followed as directed over the long term. Half the patients given drugs soon stop taking them regularly, whereas eighty percent of the study's participants who were taught meditation techniques were still regularly using them five years later. Therefore, the meditation treatment is the one likely to produce the best results.

Which of the following, if true, most seriously weakens the argument?

A. People who have high blood pressure are usually advised by their physicians to make changes in diet that have been found in many cases to reduce the severity of the condition.
B. The participants in the study were selected in part on the basis of their willingness to use meditation techniques.
C. Meditation techniques can reduce the blood pressure of people who do not suffer from high blood pressure.
D. Some of the participants in the study whose high blood pressure was controlled through meditation techniques were physicians.
E. Many people with dangerously high blood pressure are unaware of their condition.

ANALYSIS

The answer to the question is option B. Let’s understand this.

What is the population for the study mentioned? The population is the set of all high BP (Blood Pressure) patients. The sample is the set of all the participants of the study.

Now, given that option B says the people chosen for the sample were those who were more willing to use meditation techniques, can we call this sample a representative sample? The answer is No. The sample is biased, i.e. it has people who were more willing to use meditation techniques than the prescribed drugs. A representative sample would have been one that was selected without considering people's willingness to use one treatment over the other.

Now, since the sample used in the study is not representative, we cannot believe the results of the study. Since we cannot believe the results of the study, there is no basis to believe the conclusion of the argument, which used the research findings as its premises.

Rather, in a study conducted using the biased sample, it was to be expected that people in the study would use meditation techniques over a longer term than medications, because the study selected only those people who were more willing to use meditation techniques. So, even if people in the general population might not actually keep using meditation techniques for longer than medications, the study will still support meditation techniques because the chosen sample was biased in favor of meditation techniques.

Therefore, we can see that option B creates doubts about the conclusion and hence is a valid weakener.

OG QUESTION – ASSUMPTION

The Earth's rivers constantly carry dissolved salts into its oceans. Clearly, therefore, by taking the resulting increase in salt levels in the oceans over the past hundred years and then determining how many centuries of such increases it would have taken the oceans to reach current salt levels from a hypothetical initial salt-free state, the maximum age of the Earth's oceans can be accurately estimated.

Which of the following is an assumption on which the argument depends?

A. The quantities of dissolved salts deposited by rivers in the Earth's oceans have not been unusually large during the past hundred years.
B. At any given time, all the Earth's rivers have about the same salt levels.
C. There are salts that leach into the Earth's oceans directly from the ocean floor.
D. There is no method superior to that based on salt levels for estimating the maximum age of the Earth's oceans.
E. None of the salts carried into the Earth's oceans by rivers are used up by biological activity in the oceans.

ANALYSIS

The answer to the question is option A. Let’s understand this.

This question is quite tricky since we are making two levels of estimation. One estimate is what we are given in the passage, i.e. the estimate of the oceans’ age. However, to get an estimate of the oceans’ age, we need an estimate of the rate of increase of the salt level. This is because, as given in the passage, we are going to calculate the maximum age of the oceans using the formula:

Maximum age = current salt level of the oceans / rate of increase of salt level

So, if we have the current salt level as 100 and we estimate the rate of increase of salt level as 2, then we can say that the maximum age of the oceans is 100/2, i.e. 50. Therefore, to estimate the maximum age of the oceans, we need to estimate the rate of increase of the salt level.

To get a correct estimate of the oceans’ age, we need the average rate of increase of the salt level from the time the oceans were formed till now.
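The division described above can be sketched in a few lines of Python. The figures 100 and 2 come from the example in the text; reading the rate as "per century" (so the result is in centuries) and the alternative rate of 4 are assumptions added here purely for illustration.

```python
def max_age(current_salt_level, increase_per_century):
    """Estimated maximum age (in centuries) under a constant rate of increase."""
    return current_salt_level / increase_per_century

# Figures from the text: salt level 100, estimated rate 2.
print(max_age(100, 2))  # 50.0

# Hypothetical: the last hundred years saw an unusually large increase of 4,
# even though the long-run average rate was 2. The estimate is now far too low.
print(max_age(100, 4))  # 25.0
```

The estimate is only as good as the rate that is plugged in: if the measured hundred years are not typical of the whole period, the computed age moves with the distortion.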
Since we are looking at a period from the formation of the oceans till now, the population is this entire period. The sample in this case is the last 100 years. From our understanding of representative samples, we know that to get a correct estimate, the rate of increase of the salt level in the last 100 years should be representative of the rate of increase over the entire period. Option A communicates the same message by saying that the sample is not unrepresentative, i.e. the increase in salt levels over the past 100 years has not been unusually large.

Moreover, if we negate option A, we have that the quantities of dissolved salts deposited by rivers in the Earth's oceans have been unusually large during the past hundred years. This negated statement means that the sample is unrepresentative. Thus, negating option A breaks down the conclusion. Therefore, option A is the required assumption.

Here, I just want to talk a bit about option E because that is the most confusing option statement. For option E to be an assumption, it must pass the negation test. When we negate option E, we have that some of the salts carried into the Earth's oceans by rivers are used up by biological activity in the oceans. Does that break down the conclusion? No. If the rate of consumption of salt by biological activity has not been unusually higher or lower in the last 100 years than it has been over the entire period, we have a representative sample and thus we’ll have a correct estimate of the oceans’ age. This means that even if some of the salts are used up by biological activity, we can still accurately estimate the oceans’ age. Therefore, negating option E does not break down the conclusion.

TAKE AWAYS

So, here is what we have learnt from this article:

1. For a sample to provide a correct estimate of the population, it must be a representative sample, i.e. it must have the same characteristics as the population.
2. An argument which draws a conclusion based on research findings:
a. Can be weakened by suggesting that the sample chosen was not representative of the population
b. Can be strengthened by suggesting that the sample chosen was indeed representative of the population
c. Is based on the assumption that the sample chosen in the study was representative of the population.

https://gmatclub.com/blog/wp-content/uploads/2013/07/gmat2-banner11-e1373945998281.jpg

EXERCISE QUESTIONS - OG

1. Often patients with ankle fractures that are stable, and thus do not require surgery, are given follow-up x-rays because their orthopedists are concerned about possibly having misjudged the stability of the fracture. When a number of follow-up x-rays were reviewed, however, all the fractures that had initially been judged stable were found to have healed correctly. Therefore, it is a waste of money to order follow-up x-rays of ankle fracture initially judged stable.

Which of the following, if true, most strengthens the argument?

A. Doctors who are general practitioners rather than orthopedists are less likely than orthopedists to judge the stability of an ankle fracture correctly.
B. Many ankle injuries for which an initial x-ray is ordered are revealed by the x-ray not to involve any fracture of the ankle.
C. X-rays of patients of many different orthopedists working in several hospitals were reviewed.
D. The healing of ankle fractures that have been surgically repaired is always checked by means of a follow-up x-ray.
E. Orthopedists routinely order follow-up x-rays for fractures of bone other than ankle bones.

2. Frobisher, a sixteenth-century English explorer, had soil samples from Canada's Kodlunarn Island examined for gold content. Because high gold content was reported, Elizabeth I funded two mining expeditions. Neither expedition found any gold there. Modern analysis of the island's soil indicates a very low gold content. Thus the methods used to determine the gold content of Frobisher's samples must have been inaccurate.

Which of the following is an assumption on which the argument depends?

A. The gold content of the soil on Kodlunarn Island is much lower today than it was in the sixteenth century.
B. The two mining expeditions funded by Elizabeth I did not mine the same part of Kodlunarn Island.
C. The methods used to assess gold content of the soil samples provided by Frobisher were different from those generally used in the sixteenth century.
D. Frobisher did not have soil samples from any other Canadian island examined for gold content.
E. Gold was not added to the soil samples collected by Frobisher before the samples were examined.

Hope this helps.

-Chiranjeev Singh

PS: If, by any chance, you are still wondering about the tiger and cat image at the beginning, the image means that even though a sample is generally smaller than the population, it needs to have the same characteristics as the population.

knightofdelta · Answer

Very interesting stuff.

Answers

Question 1:  C . X-rays of patients of many different orthopedists working in several hospitals were reviewed.

Option C will strengthen the argument because samples of patients taken from many different orthopedists will cover a very wide range of the population, covering both patients and doctors. This will reduce the bias towards a particular set of the population and thus make the argument stronger.

Question 2:

E.  Gold was not added to the soil samples collected by Frobisher before the samples were examined.
The argument states that the methods used to determine the gold content of Frobisher's samples must have been inaccurate. For inaccurate methods to be the reason the reported gold content of Frobisher's samples was high, the samples must not have been tampered with. By negating the statement and saying that gold was added to the soil samples collected by Frobisher before the samples were examined, the argument breaks down completely, because if gold was added to the samples, the methods used to determine their gold content could well have been accurate.

blueseas · Answer

great article as usual.

i think the below question also comes under same category.

1)Guidebook Writer: I have visited hotels throughout the country and have noticed that in those built before 1930 the quality of the original carpentry work is generally superior to that in hotels built afterward. Clearly carpenters working on hotels before 1930 typically worked with more skill, care, and effort than carpenters who have worked on hotels built subsequently.

Which of the following, if true, most seriously weakens the guidebook writer’s argument?

A. The quality of original carpentry in hotels is generally far superior to the quality of original carpentry in other structures, such as houses and stores.
B. Hotels built since 1930 can generally accommodate more guests than those built before 1930.
C. The materials available to carpenters working before 1930 were not significantly different in quality from the materials available to carpenters working after 1930.
D. The better the quality of original carpentry in a building, the less likely that building is to fall into disuse and be demolished.
E. The average length of apprenticeship for carpenters has declined significantly since 1930.
guidebook-writer-have-visited-hotels-throughout-the-country-80358.html#p603511

bagdbmba · Answer

IMO 1-C , 2-B Chiranjeev- can you please confirm the same! Please keep the 700+ questions coming like SC

egmat · Answer

Thank you for the appreciation

Yes, you are correct - this question also uses the concept of "representative sample", but applying this concept to this question requires a bit of nuanced understanding.

Thank you for posting the question on this thread. -Chiranjeev

egmat · Answer

Hi, I'll post the answers on Monday. I am expecting a few more replies before I post the answers.

Regarding your demand for 700+ questions, we'll be posting two fresh e-GMAT CR questions on Monday. So, be on the lookout. -Chiranjeev

bagdbmba · Answer

Great Sir!
Look forward to it...

egmat · Answer

Hi,

The correct answers are C and E.

Thanks,
Chiranjeev

bagdbmba · Answer

Thanks Chiranjeev...
Oops! I got one wrong...Just a quick clarification - when I've narrowed down to two options in 'Assumption question' , is it the best time to use negation test right there instead of using it upfront as latter case will take much time?

Can you please explain the question of 'Guidebook Writer:' posted here by blueseas?

Appreciate your reply.

Skag55 · Answer

First of all great effort on summing up a few best practices and lessons on CR!

Secondly, I'm a bit dubious on your 2nd example on Assumption type questions:

E.  None  of the salts carried into the Earth's oceans by rivers are used up by biological activity in the oceans

If we negate that, shouldn't it be " All  of the salts carried into the Earth's oceans by rivers are used up by biological activity in the oceans", i.e all opposite of none ?

Therefore, if we do that, the argument also falls apart, no?

egmat · Answer

Hi, Thanks for the appreciation!

The negation of "None" is "some", not "all". Why so? Let's first understand: what does negation mean logically? Let's suppose we have a statement A. In that case, the negation of A, let's say ~A, will be a statement which
1. is always false when A is true
2. is always true when A is false

In essence,
1. at any time, either A or ~A should be true
2. at no time can A and ~A both be true (or both be false).

Now, if we apply this understanding to "None" and "All", we can see that these are not negations of each other. So, if 50% of the salts are used up by biological activity, then both "None of the salts are used up by biological activity" and "All of the salts are used up by biological activity" are false at the same time. Neither is true. So, they are not negations of each other.

The correct negation of "None" is "some". "Some" means anything greater than zero (including "all" or 100%).

Thanks, Chiranjeev
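The two conditions for a negation can also be checked mechanically. Below is a small Python sketch; modelling each statement as a test on the fraction of salts used up (0.0 to 1.0) is an assumption made here for illustration, not something from the original question.

```python
def none_used(fraction):
    """'None of the salts are used up': the used fraction is exactly zero."""
    return fraction == 0

def some_used(fraction):
    """'Some of the salts are used up': anything greater than zero, including all."""
    return fraction > 0

def all_used(fraction):
    """'All of the salts are used up': the used fraction is 100%."""
    return fraction == 1

# Try the three cases: nothing used, half used, everything used.
for fraction in (0.0, 0.5, 1.0):
    print(fraction, none_used(fraction), some_used(fraction), all_used(fraction))
```

In every case exactly one of "none" and "some" is true, which is what a negation requires. At 0.5, "none" and "all" are both false, so "all" cannot be the negation of "none".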

Skag55 · Answer

Aha! That clears up things. So to sum it up, "negation" is not "opposite", it's "anything else but  this ".

I perceived it as 
"opposite of 1 is -1" (if you think in the X axis for instance), instead of 
"not 1 is 0, -1, 52 etc.."

Thanks for the clarification!

bagdbmba · Answer

Hi Chiranjeev,
Could you please shoot a reply on the above concerns?

Would much appreciate your feedback.

egmat · Answer

Hi, Sorry for the delay! Lots of things are going on at the same time.

Yes, you are correct. You should use the negation test only when you have narrowed down to two or at most three option statements. Of course, time is one factor, but also, if you need to apply the negation test to all the option statements, it tells you that you have not really understood the passage clearly.

What is your take on the question? Can you identify the sample and the population? Also, how does option D, which is the correct choice, indicate that the sample was not representative? I think you learn best when you do it yourself. I'll provide my analysis after you do this.

Thanks, Chiranjeev

bagdbmba · Answer

Hi Chiranjeev, I think(NOT 100% confident although), here sample is - the hotels (both pre-1930 and post-1930) that the Guidebook Writer has visited throughout the country. Whereas population - ALL the hotels built pre-1930 and post-1930. Not getting option D as it appears to me as a general statement!

Although marked on the basis of the fact that other options could not be correct.... But, if you ask me to provide solid reasoning why I chose D on its own, well, I don't have an answer honestly. Please provide your analysis...

egmat · Answer

Amazing stuff. You are bang on in finding the population and the sample.

Now, let's look at the argument and option D:

Guidebook Writer: I have visited hotels throughout the country and have noticed that in those built before 1930 the quality of the original carpentry work is generally superior to that in hotels built afterward. Clearly carpenters working on hotels before 1930 typically worked with more skill, care, and effort than carpenters who have worked on hotels built subsequently.

D. The better the quality of original carpentry in a building, the less likely that building is to fall into disuse and be demolished.

What does option D mean?

Let's consider a time period: 1800 A.D. to 2000 A.D. Suppose the argument was written in 2000 A.D. (Please note that I am using these dates to explain the logic; you can use any other dates and the logic will remain the same.)

Also, suppose that the "average" life of a hotel is 100 years.

Now, in such a case, we'll have a very large proportion of hotels created after 1930 still in existence. Right? 
On the other hand, a large proportion of hotels created from 1800 to 1930 would not exist now.

Now, of the hotels built from 1800 to 1930, which hotels would exist now? The ones with the best original carpentry.

So, when the guidebook writer, i.e. the author of the argument, visits any hotel built before 1930, he is going to see really good carpentry. The reason is that the only hotels built before 1930 that still exist are the ones with the best original carpentry.

But these hotels that exist now are not representative of all the hotels built before 1930.  Only the best exist now; the rest have perished . So, the average quality of carpentry in these hotels will be much greater than the average quality of carpentry of all the hotels built before 1930.

Since the sample of hotels built before 1930 that exist now (or that the author visited) is not representative of all the hotels built before 1930, the conclusion is greatly weakened.

Does it help?

Thanks,
Chiranjeev

bagdbmba · Answer

Hi Chiranjeev,
Thanks for your detailed analysis but I'm unfortunately not able to understand how D is weakening the argument actually ?

Additionally, here we're showing that the 'sample of hotels built before 1930 that exist now (or that the author visited) is not representative of all the hotels built before 1930' by assuming that "average" life of a hotel is 100 years within the stipulated time period.

But what if "average" life of a hotel is 300 years - it could have been assumed as well I guess...! In this case, then can we say that the above sample is NOT the proper representative of the pre-1930 hotels ? If yes, then please let me know how?

egmat · Answer

Yes, in case of 300 years also, the same logic will apply. Remember, I considered the period from 1800 onwards, if 300 years is the average age, then we'll have hotels built even before 1700 still in existence. 300 years is the average life: some hotels may have more than average life and some may have less than average life.

So, now if we consider period from 1600 onwards, the same logic will apply as the logic applied in case of 100 years and 1800 A.D.

The more you increase the average life, the further back in time we'll need to go, because with a higher average life we'll have more old hotels still in existence.

An analogy to understand this:
Suppose you read books by an author X who died 100 years back and books by an author Y who is currently alive. Then, you say that the average quality of X's books is better than that of Y's.

So, you read some books (sample) and you made a conclusion about all the books (population).

Now, if someone tells you that only those books of X are preserved now which were his best and the rest of the books were actually discarded, can you still make a claim that the average quality of all books of X is better than average quality of all books of Y? No. Why? Because you have read only the best books of X but all kinds (best, not so best, bad) books of Y.

Hope this helps.

Thanks,
Chiranjeev

bagdbmba · Answer

Hi Chiranjeev,
Thanks for the analysis.+1

I got your analogy clearly but still having some confusion on the above part...it seems to me that we're here applying some hard fast rule in order to make option D correct.

The better the quality of original carpentry in a building, the less likely that building is to fall into disuse and be demolished.   - how does it say something like in your analogy  only those books of X are preserved now which were his best and the rest of the books were actually discarded ...NOT able to relate these two parts.

Option D could also mean that most of the pre-1930 hotels exist today because of having superior quality of original carpentry...!

Please help. I'm really having a difficult shot to understand it.

egmat · Answer

Ok. Let me use some numbers here.

Suppose there are 1000 hotels that were created before 1930 and 1000 hotels that were created after 1930.

For the hotels created before 1930,

A1:Quality of 300 hotels = 100
A2: Quality of 400 hotels = 200
A3: Quality of 300 hotels = 300

Average quality = 200

For the hotels created after 1930,

B1: Quality of 300 hotels = 100
B2: Quality of 400 hotels = 200
B3: Quality of 300 hotels = 300

Average quality = 200

So, the average quality of hotels from both periods is the same.

Now, let's bring option D into the picture.

The better the quality of original carpentry in a building, the less likely that building is to fall into disuse and be demolished

Now, tell me of hotels created before 1930, which hotels are more likely to exist now. A3? Right. After that? A2?

So, now let's suppose currently we have the following pre-1930 hotels left

A1: 150 (half of them got demolished)
A2: 300 (25% of them got demolished)
A3: 300 (None of them got demolished)

Average quality now = 220

For hotels built after 1930

B1: 240 (20% of them got demolished) (Since these hotels are built after 1930 and are newer than those built before 1930, the ratio of hotels that got demolished will be lower)
B2: 360 (10% of them got demolished)
B3: 300 (None of them got demolished)

Average quality now = 206.67 (approximately)

So, even though the initial quality is the same, because of option D, the quality you see today in pre-1930 hotels is better than in post-1930 hotels.
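For anyone who wants to check the arithmetic, here is a minimal Python sketch of the numbers above. The tier names and helper function are made up for illustration; the hotel counts and the quality scores (100, 200, 300) are the ones used in this post. On that scale the surviving averages work out to 220 and about 206.67, i.e. 2.2 and roughly 2.07 in units of 100.

```python
# Quality score of each tier, as used in the post (A1/B1, A2/B2, A3/B3).
QUALITY = {"tier1": 100, "tier2": 200, "tier3": 300}

def average_quality(counts):
    """Average carpentry quality, given the number of hotels in each tier."""
    total_hotels = sum(counts.values())
    total_quality = sum(counts[t] * QUALITY[t] for t in counts)
    return total_quality / total_hotels

# Hotels as originally built (identical for both periods).
built = {"tier1": 300, "tier2": 400, "tier3": 300}

# Hotels still standing today, after the demolitions assumed in the post.
pre_1930_surviving = {"tier1": 150, "tier2": 300, "tier3": 300}
post_1930_surviving = {"tier1": 240, "tier2": 360, "tier3": 300}

print(average_quality(built))                          # 200.0
print(average_quality(pre_1930_surviving))             # 220.0
print(round(average_quality(post_1930_surviving), 2))  # 206.67
```

The hotels were equally good when built; the gap appears only because more of the low-quality pre-1930 hotels have disappeared from the sample a visitor can see today.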

Just to emphasize, I did not choose these numbers to make option D correct; I chose them out of logic.

Let me know if it addresses your doubts.

Thanks,
Chiranjeev
