Re: Statistics Made Easy - All in One Topic! [#permalink]
Some Mean Questions

BY KARISHMA, VERITAS PREP


I hope the theory of arithmetic mean we discussed above is clear to you. Let’s see the theory in action today. I will pick some mean questions from various sources (Official Guide, GMAT prep tests, etc.) and we will try to use the concepts we learned last week to solve them.

Let’s start with a simple question.

Question 1: For the past n days the average daily production at a company was 60 units. If today’s production of 100 units raises the average to 65 units per day, what is the value of n?

(A) 30
(B) 18
(C) 10
(D) 9
(E) 7

Solution: If today’s production were also 60 units, what would have happened to the average? Obviously, it would have stayed the same! But today’s production is 40 units extra and hence it raised the average. It raised the average by 5 units which means that each one of the n observations and today’s observation got an extra 5. Since 40 got distributed and each was given 5, there must have been a total of 40/5 = 8 observations including today’s. Therefore, the value of n must have been 8 – 1 = 7.

Answer (E)

I know you can solve the question using the formula of averages. In fact, you can solve every question using the formula and working out the values. But the point is that the logical method helps you solve the question very quickly and you are less likely to make calculation errors since there aren’t too many calculations to perform! Let’s go on now.
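
For readers who like to check the shortcut against the formula, here is a minimal Python sketch of the reasoning in Question 1. The function name `days_before_today` is only an illustrative choice, not something from the original post.

```python
def days_before_today(old_avg, new_avg, todays_value):
    """Deviation method: today's surplus over the old average is shared
    equally among all n + 1 days, lifting each by (new_avg - old_avg)."""
    surplus = todays_value - old_avg   # 100 - 60 = 40
    rise = new_avg - old_avg           # 65 - 60 = 5
    total_days = surplus / rise        # days including today
    return total_days - 1              # days before today

n = days_before_today(60, 65, 100)

# Cross-check with the averages formula: (60n + 100) / (n + 1) should be 65.
assert (60 * n + 100) / (n + 1) == 65
print(n)
```

Both routes agree; the logical method simply skips setting up the equation.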

Question 2: When Anna makes a contribution to a charity fund at school, the average contribution size increases by 50%, reaching $75 per person. If there were 5 other contributions made before Anna’s, what is the size of her donation?

(A) $100
(B) $150
(C) $200
(D) $250
(E) $450

Solution: After Anna’s contribution, the average size increases by 50% and reaches $75. What must have been the average size of contribution before Anna’s donation? It must have been $50 since a 50% increase would lead us to $75. So, $50 was the average size of 5 donations before Anna made her donation. Had Anna donated $50 as well, the average would have stayed the same i.e. $50. But the average increased to $75 which means that Anna donated an extra $25 for each of the 6 observations (including her) in addition to the $50 she would have donated to keep the average same.

Hence, the amount Anna donated = 50 + 6*25 = $200

Answer (C)
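
The same logic can be written as a small Python sketch. The function and parameter names are hypothetical, chosen only to mirror the steps of the solution.

```python
def new_member_value(new_avg, pct_increase, others):
    """Value a newcomer must bring to lift the average of `others`
    existing values by `pct_increase` percent to `new_avg`."""
    old_avg = new_avg / (1 + pct_increase / 100)   # 75 / 1.5 = 50
    rise = new_avg - old_avg                       # 25 extra per person
    # The newcomer matches the old average and also funds the rise
    # for every observation, including her own.
    return old_avg + (others + 1) * rise

donation = new_member_value(75, 50, 5)

# Cross-check: five donations averaging 50 plus Anna's should average 75.
assert (5 * 50 + donation) / 6 == 75
print(donation)
```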

Again, this was a relatively straightforward question. Let’s look at a tricky one now.

Question 3: A set of numbers has an average of 50. If the largest element is 4 greater than 3 times the smallest element, which of the following values cannot be in the set?

(A) 85
(B) 90
(C) 123
(D) 150
(E) 155

Solution: This question might look a little ominous but it isn’t very tough, really! The set has an average of 50 so that already tells us that we can represent each element of the set by 50. If there is an element which is a little less than 50, there will be another element which is a little more than 50.

The largest element is 4 greater than 3 times the smallest element so L = 4 + 3S.

The smallest element must be less than 50 and the largest must be greater than 50. Say, if the smallest element is 20, the largest will be 4 + 3*20 = 64.

Is there any limit imposed on the largest value of the largest element? Yes, because there is a limit on the largest value of the smallest element. The smallest element must be less than 50. The smallest member of the set can be 49.9999… The limiting value of the smallest number is 50. As long as the smallest number is a tiny bit less than 50, you can have the greatest number a tiny bit less than 4 + 3*50 = 154. The number 154 and all numbers greater than 154 cannot be a part of the set. Say if the smallest element is 49, the largest element will be 4 + 3*49 = 151. So the set could look something like this:

S = {49, 49, 49, 49, … (101 times to balance out the extra 101 in 151), 50, 50, 151}

Only option (E) cannot be a part of the set.
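
To see that the other four options really can appear, the balancing idea used for the set S above can be sketched in Python. This is only an illustration under one assumption: the smallest element is fixed at 49 (so the largest is 151), exactly as in the example set. The helper name `witness_set` is hypothetical.

```python
SMALLEST = 49                    # assumed smallest element, as in the example set
LARGEST = 4 + 3 * SMALLEST       # 151
MEAN = 50

def witness_set(value):
    """Return a set with mean 50 containing `value`, or None if `value`
    does not fit between the assumed smallest and largest elements."""
    if not SMALLEST <= value <= LARGEST:
        return None
    core = [SMALLEST, value, LARGEST]
    excess = sum(x - MEAN for x in core)   # how far the core overshoots the mean
    # Each extra 49 sits exactly 1 below the mean, so `excess` copies balance it.
    return core + [SMALLEST] * excess

for option in (85, 90, 123, 150, 155):
    s = witness_set(option)
    if s is None:
        print(option, "has no such set")
    else:
        assert sum(s) == MEAN * len(s) and max(s) == 4 + 3 * min(s)
        print(option, "fits in a set of", len(s), "elements")
```

For 155 no choice of smallest element helps, since the largest element must stay below 4 + 3*50 = 154.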

These were some of the basic (and not so basic) questions on mean that we could come across on the GMAT. We will look at some more stats concepts in the next post. Till then, keep practicing!
Application of Arithmetic Means

BY KARISHMA, VERITAS PREP


In the above post we discussed arithmetic means of arithmetic progressions in GMAT math problems. Now, let’s see those concepts in action.

Question 1: If x is the sum of the even integers from 200 to 600 inclusive, and y is the number of even integers from 200 to 600 inclusive, what is the value of x + y?

(A) 200*400
(B) 201*400
(C) 200*402
(D) 201*401
(E) 400*401

Solution:

There are various ways of getting the answer here. We will use the concepts we learned last week.

The given sequence is 200, 202, 204, … 600

It is an arithmetic progression. What is the total number of terms here?

You can use one of two methods to get the number of terms here:

Method 1: Using Logic

In every 100 consecutive integers, there are 50 odd integers and 50 even integers. So we will get 50 even integers from each of 200 – 299, 300 – 399, 400 – 499 and 500 – 599 i.e. a total of 50*4 = 200 even integers. Also, since the sequence includes 600, number of even integers = 200 + 1 = 201

Method 2:

Recall that in our arithmetic progressions post, we saw that the last term of a sequence which has n terms will be first term + (n – 1)* common difference.

\(600 = 200 + (n - 1)*2\)

\(n = 201\)

Hence \(y = 201\) (because y is the number of even integers from 200 to 600)

Let’s go on now. What is the average of the sequence? Since it is an arithmetic progression with an odd number of terms, the average must be the middle number i.e. 400.

Notice that this arithmetic progression looks like this:

(n - m), … (n - 6), (n - 4), (n - 2), n, (n + 2), (n + 4), (n + 6), … (n + m)

We can find the middle number i.e. the average by just averaging the first and the last terms.

\(\frac{(n - m) + (n + m)}{2} = \frac{2n}{2} = n\)

\(Average = \frac{(200 + 600)}{2} = 400\)

Sum of all terms in the sequence = x = Arithmetic Mean * Number of terms = 400*201

\(x + y = 400*201 + 201 = 401*201\)

Answer (D)

This question was simple. You could have found the sum using the formula \(\frac{n}{2}*(2a + (n-1)d)\) that we saw in the AP post. But this method is more intuitive since if you don’t want to, you don’t have to use any formula here. Anyway, let’s go on to our second question for today.
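
As a quick sanity check, the count, the average and the sum can be reproduced with a few lines of Python. This is just a verification sketch; the variable names follow the x and y of the question.

```python
evens = range(200, 601, 2)            # 200, 202, ..., 600

y = len(evens)                        # number of terms
average = (evens[0] + evens[-1]) / 2  # mean of an AP = mean of first and last terms
x = int(average * y)                  # sum = mean * number of terms

assert x == sum(evens)                # agrees with adding every term
assert x + y == 201 * 401             # option (D)
print(y, average, x, x + y)
```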

Question 2: The sum of n consecutive positive integers is 45. What is the value of n?

Statement I: n is even

Statement II: n < 9

Solution: First I will give the solution of this question and then discuss the logic used to solve it.

In how many ways can you write n consecutive integers such that their sum is 45? Let’s see whether we can get such numbers for some values of n.

n = 1 -> Numbers: 45
n = 2 -> Numbers: 22 + 23 = 45
n = 3 -> Numbers: 14 + 15 + 16 = 45
n = 4 -> No such numbers
n = 5 -> Numbers: 7 + 8 + 9 + 10 + 11 = 45
n = 6 -> Numbers: 5 + 6 + 7 + 8 + 9 + 10 = 45

Let’s stop right here.

Statement I: n must be even.

n could be 2 or 6. Statement I alone is not sufficient.

Statement II: n < 9
n can take many values less than 9, hence Statement II alone is not sufficient.

Both statements together: Since n can take values 2 or 6 which are even and less than 9, both statements together are not sufficient.

Answer (E)

Now, the interesting thing is how do we get these numbers for different values of n. How do we know the values that n can take? It’s pretty easy really. Follow my thought here.

Of course, n can be 1. In that case we have only one number i.e. 45.

n can be 2. Why? When we divide 45 by 2, we get 22.5. Since 2*22.5 is 45, we have to find 2 consecutive integers such that their arithmetic mean is 22.5. The integers are obviously 22 and 23.

n can be 3. When we divide 45 by 3, we get 15. So we need 3 consecutive integers such that their mean is 15. They are 14, 15, 16.

When we divide 45 by 4, we get 11.25. Do we have 4 consecutive integers such that their mean is 11.25? No, because the mean of an even number of consecutive integers is always of the form x.5.

n can be 5. When we divide 45 by 5, we get 9 so we need 5 consecutive integers such that their mean is 9. They must be 7, 8, 9, 10, 11.

n can be 6. When we divide 45 by 6, we get 7.5. We need 6 consecutive integers such that their mean is 7.5. The integers are 5, 6, 7, 8, 9, 10.

Obviously, we just need to focus on getting 2 even values of n which are less than 9. So we check for 2, 4 and 6 and we immediately know that the answer is (E). We don’t have to do this process for all numbers less than 9 and we don’t have to do it for odd values of n.
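
The search for valid values of n can also be sketched in Python. The helper below is a hypothetical illustration: it uses the fact that n consecutive integers starting at a add up to n*a + n(n - 1)/2, and keeps only runs that start at a positive integer.

```python
def consecutive_runs(total, max_n):
    """Map each n <= max_n to the run of n consecutive positive
    integers that sums to `total`, when such a run exists."""
    runs = {}
    for n in range(1, max_n + 1):
        rest = total - n * (n - 1) // 2   # this must equal n * (first term)
        if rest > 0 and rest % n == 0:
            first = rest // n
            runs[n] = list(range(first, first + n))
    return runs

runs = consecutive_runs(45, 8)            # Statement II: n < 9
for n, numbers in runs.items():
    print(n, numbers)

even_n = [n for n in runs if n % 2 == 0]  # Statement I: n is even
print(even_n)
```

More than one even value of n survives, which is why the two statements together are still not sufficient.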

We will move on to median in the next post. Till then, keep practicing!
A 750 Level GMAT Question on Statistics!

BY KARISHMA, VERITAS PREP


In this post, we have a very interesting statistics question for you. Above, we have already discussed statistics concepts such as mean, median, range.

This question needs you to apply all these concepts but can still be easily done in under two minutes. Now, without further ado, let’s go on to the question – there is a lot to discuss there.

Question: An automated manufacturing unit employs N experts such that the range of their monthly salaries is $10,000. Their average monthly salary is $7000 above the lowest salary while the median monthly salary is only $5000 above the lowest salary. What is the minimum value of N?

(A) 10
(B) 12
(C) 14
(D) 15
(E) 20

Solution: Let’s first assimilate the information we have. We need to find the minimum number of experts that must be there. Why should there be a minimum number of people satisfying these statistics? Let’s try to understand that with some numbers.

Say, N cannot be 1 i.e. there cannot be a single expert in the unit because then you cannot have the range of $10,000. You need at least two people to have a range – the difference of their salaries would be the range in that case.

So there are at least 2 people – say one with salary 0 and the other with 10,000. No salary will lie outside this range.

Median is $5000, i.e. when all salaries are listed in increasing order, the middle salary (or the average of the middle two) is $5000. With 2 people, one at 0 and the other at 10,000, the median will be the average of the two i.e. (0 + 10,000)/2 = $5000. Since the answer choices tell us there are at least 10 people, there is probably someone earning $5000. Let’s put in 5000 there for reference.

0 … 5000 … 10,000

Arithmetic mean of all the salaries is $7000. Now, mean of 0, 5000 and 10,000 is $5000, not $7000 so this means that we need to add some more people. We need to add them more toward 10,000 than toward 0 to get a higher mean. So we will try to get a mean of $7000.

Let’s use deviations from the mean method to find where we need to add more people.

0 is 7000 less than 7000 and 5000 is 2000 less than 7000 which means we have a total of $9000 less than 7000. On the other hand, 10,000 is 3000 more than 7000. The deviations on the two sides of mean do not balance out. To balance, we need to add two more people at a salary of $10,000 so that the total deviation on the right of 7000 is also $9000. Note that since we need the minimum number of experts, we should add new people at 10,000 so that they quickly make up the deficit in the deviation. If we add them at 8000 or 9000 etc, we will need to add more people to make up the deficit at the right.

Now we have

0 … 5000 … 10000, 10000, 10000

Now the mean is 7000 but note that the median has gone awry. It is 10,000 now instead of the 5000 that is required. So we will need to add more people at 5000 to bring the median back to 5000. But that will disturb our mean again! So when we add some people at 5000, we will need to add some at 10,000 too to keep the mean at 7000.

5000 is 2000 less than 7000 and 10,000 is 3000 more than 7000. We don’t want to disturb the total deviation from 7000. So every time we add 3 people at 5000 (which will be a total deviation of 6000 less than 7000), we will need to add 2 people at 10,000 (which will be a total deviation of 6000 more than 7000), to keep the mean at 7000 – this is the most important step. Ensure that you have understood this before moving ahead.

When we add 3 people at 5000 and 2 at 10,000, we are in effect adding an extra person at 5000 and hence it moves our median a bit to the left.

Let’s try one such set of addition:

0 … 5000, 5000, 5000, 5000 … 10000, 10000, 10000, 10000, 10000

The median is not $5000 yet. Let’s try one more set of addition.

0 … 5000, 5000, 5000, 5000, 5000, 5000, 5000 … 10000, 10000, 10000, 10000, 10000, 10000, 10000

The median now is $5000 and we have maintained the mean at $7000.

This gives us a total of 15 people.

Answer (D)

Granted, the question is tough but note that it uses very basic concepts and that is the hallmark of a good GMAT question!

Try to come up with some other methods of solving this.
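
One possible alternative, offered only as a sketch and not as the method of the original post: for a given N, push every salary as high as the constraints allow and see whether the mean can reach $7000 at all. The helper `best_case` is hypothetical and assumes N is at least 3, the lowest salary is taken as 0, one person sits at the lowest salary, the median is held at 5000 and everyone else earns the top salary of 10,000.

```python
from statistics import mean, median

def best_case(n, low=0, mid=5000, high=10000):
    """Salary list for n >= 3 people with the highest total that still
    has lowest = low, median = mid and highest = high."""
    at_median = n // 2                     # fewest people needed at the median
    at_top = n - 1 - at_median             # everyone else goes to the top salary
    return [low] + [mid] * at_median + [high] * at_top

for n in range(3, 21):
    salaries = best_case(n)
    assert median(salaries) == 5000 and max(salaries) - min(salaries) == 10000
    print(n, round(mean(salaries)))

# Smallest n whose best case reaches a mean of 7000
minimum_n = next(n for n in range(3, 21) if mean(best_case(n)) >= 7000)
print(minimum_n)
```

For any smaller N even this most favourable arrangement falls short of a $7000 mean, which matches the answer found with deviations.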
Finding Arithmetic Mean Using Deviations

BY KARISHMA, VERITAS PREP


This post is again focused on arithmetic mean. Let’s start our discussion by considering the case of arithmetic mean of an arithmetic progression.

We will start with an example. What is the mean of 43, 44, 45, 46, 47? (Hint: If you are thinking about adding the numbers, that’s not the way I want you to go.)

As we discussed in our previous posts, arithmetic mean is the number that can represent/replace all the numbers of the sequence. Notice in this sequence, 44 is one less than 45 and 46 is one more than 45. So essentially, two 45s can replace both 44 and 46. Similarly, 43 is 2 less than 45 and 47 is 2 more than 45 so two 45s can replace both these numbers too.

The sequence is essentially 45, 45, 45, 45, 45.

Hence, the arithmetic mean of this sequence must be 45! (If you have doubts, you can calculate and find out.)

It makes sense, doesn’t it? The middle number in the sequence of consecutive positive integers will be the mean. The deviations of all numbers to the left of the middle number will balance out the deviations of all the numbers to the right of the middle number.

(In this post, we will assume that the given numbers are in increasing/decreasing order. If that is not the case, you can always put them in increasing order and use these concepts.)

Once again, what is the mean of 192, 193, 194, 195, 196, 197, 198?

It is 195 since it is the middle number!

Ok, what about 192, 193, 194, 195, 196, 197? What is the mean in this case? There is no middle number here since there are 6 numbers. The mean here will be the middle of the two middle numbers which is 194.5 (the middle of the third and the fourth number). It doesn’t matter that 194.5 is not a part of this list. If you think about it, arithmetic mean of some numbers needn’t be one of the numbers.

What about 71, 73, 75, 77, 79? What will be the mean in this case? Even though these numbers are not consecutive integers, the difference between two adjacent numbers in the list is the same (it is an arithmetic progression). So the deviations of the numbers on the left of the middle number will cancel out the deviations of the numbers on the right of the middle number (71 is 4 less than 75 and 79 is 4 more than 75. 73 is 2 less than 75 and 77 is 2 more than 75). Hence, the mean here will be 75 (just like our first example).

Just to reinforce:

102, 106, 110 -> Mean = 106

102, 106, 110, 114 -> Mean = 108 (Middle of the second and third numbers)

Let’s twist this concept a little now. What is the mean of 36, 40, 42, 43, 44, 47?

This is not an arithmetic progression. So do we need to sum and then divide by 6 to get the mean? Not so fast! Let’s try and use the deviations concept we have just learned.

Given sequence: 36, 40, 42, 43, 44, 47

It seems that the mean would be around 42, right? Some numbers are less than 42 and others are more than 42.

36 is 6 less than 42.

40 is 2 less than 42.

Overall, the numbers less than 42 are 6+2 = 8 less than 42.

43 is 1 more than 42.

44 is 2 more than 42.

47 is 5 more than 42.

Overall, the numbers more than 42 are 1+2+5 = 8 more than 42.

The deviations of the numbers less than 42 get balanced out by deviations of the numbers greater than 42! Hence, the average must be 42.

This method is especially useful in cases involving big numbers which are close to each other.

Example 1: What is the average of 452, 453, 463, 467, 480, 499, 504?

What would you say the average is here? Perhaps, around 470?

Let’s see:

452 is 18 less than 470.

453 is 17 less than 470.

463 is 7 less than 470.

467 is 3 less than 470.

Overall, the numbers less than 470 are 18 + 17 + 7 + 3 = 45 less.

480 is 10 more than 470.

499 is 29 more than 470.

504 is 34 more than 470.

Overall, the numbers more than 470 are 10 + 29 + 34 = 73 more than 470.

The shortfall is not balanced by the excess. There is an excess of 73 – 45 = 28.

So what is the average? If we assume the average of these 7 numbers to be 470, there is an excess of 28. We need to distribute the excess evenly among all the numbers and hence the average will increase by 28/7 = 4. (Go back to the first post on arithmetic mean if this is not clear.)

Hence, the required mean is 470 + 4 = 474.

(If we had assumed the mean to be 474, the shortfall would have balanced the excess.)

Let’s go through one more example using this concept:

Example 2: What is the mean of 99, 103, 104, 109, 120, 123, 128, 130?

Let’s start by guessing a mean for this sequence. Say, around 115?

Let’s see if the shortfall is balanced by the excess.

99 is 16 less, 103 is 12 less, 104 is 11 less and 109 is 6 less than 115.

Overall shortfall = 16 + 12 + 11 + 6 = 45

120 is 5 more, 123 is 8 more, 128 is 13 more and 130 is 15 more than 115.

Overall excess = 5 + 8 + 13 + 15 = 41

We are close, but not quite there yet! There is a shortfall of 4. Since there are a total of 8 numbers, the average must be 4/8 = 0.5 less than 115. Hence, the average here is 114.5

Once you get the hang of this method and understand what you are doing, it is much faster than adding all the big numbers and then dividing the sum since you only deal with small numbers in this method.
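
The deviations method fits in a few lines of Python. This is a minimal sketch; the function name `mean_by_deviations` is an illustrative choice.

```python
def mean_by_deviations(numbers, guess):
    """Start from a guessed mean, add up the signed deviations
    (excess minus shortfall) and spread the net amount evenly."""
    net = sum(x - guess for x in numbers)
    return guess + net / len(numbers)

example_0 = mean_by_deviations([36, 40, 42, 43, 44, 47], 42)
example_1 = mean_by_deviations([452, 453, 463, 467, 480, 499, 504], 470)
example_2 = mean_by_deviations([99, 103, 104, 109, 120, 123, 128, 130], 115)
print(example_0, example_1, example_2)
```

Any guess works; a guess close to the true mean just keeps the deviations small enough to handle mentally.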

Let’s wrap up this post here. In the next post, we will see these concepts in action!
Some Tricky Standard Deviation Questions

BY KARISHMA, VERITAS PREP


In the above post, we promised you a couple of tricky standard deviation (SD) GMAT questions. We start with a 600-700 level question and then look at a 700 – 800 level one.

Question 1: During an experiment, some water was removed from each of the 8 water tanks. If the standard deviation of the volumes of water in the tanks at the beginning of the experiment was 20 gallons, what was the standard deviation of the volumes of water in the tanks at the end of the experiment?

Statement 1: For each tank, 40% of the volume of water that was in the tank at the beginning of the experiment was removed during the experiment.

Statement 2: The average volume of water in the tanks at the end of the experiment was 80 gallons.

Solution:

We have 8 water tanks. This implies that we have 8 elements in the set (volume of water in each of the 8 tanks). SD of the volume of water in the tanks is 20 gallons. We need to find the new SD i.e. the SD after water was removed from the tanks.

Statement 1: For each tank, 40% of the volume of water that was in the tank at the beginning of the experiment was removed during the experiment.

Initial SD is 20. When 40% of the water is removed from each tank, the leftover water is 60% of the initial volume of water i.e. 0.6*initial volume of water. This means that each element of the initial set was multiplied by 0.6 to obtain the new set. The SD will change. It will become 0.6*previous SD i.e. 0.6*20 = 12 (think of the formula of SD we discussed in the first SD post). This statement alone is sufficient.

Statement 2: The average volume of water in the tanks at the end of the experiment was 80 gallons.

The average volume doesn’t give us the SD of the new set. Hence, this statement alone is not sufficient.

Answer (A)
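
Both statements can be illustrated with made-up numbers in Python. The starting volumes below are hypothetical, chosen only so that their SD is 20; `pstdev` from the standard `statistics` module computes the population standard deviation.

```python
from statistics import mean, pstdev

start = [100] * 4 + [140] * 4          # hypothetical volumes with SD 20
assert abs(pstdev(start) - 20) < 1e-9

# Statement 1: every tank keeps 60% of its water, so the SD scales by 0.6.
after_40_percent = [v * 3 / 5 for v in start]
print(pstdev(after_40_percent))

# Statement 2: two hypothetical endings, both averaging 80 gallons,
# both reachable by removing some water from each tank.
ending_a = [80] * 8
ending_b = [70] * 4 + [90] * 4
print(mean(ending_a), pstdev(ending_a))
print(mean(ending_b), pstdev(ending_b))
```

The two endings share the same average but not the same SD, which is why the average alone cannot settle the question.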

Now that we are done with the easier one, let’s go on to the tougher one.

Question 2: M is a collection of four odd integers. The range of set M is 4. How many distinct values can standard deviation of M take?

(A) 3
(B) 4
(C) 5
(D) 6
(E) 7

Solution:

Since the range of M is 4, it means the greatest difference between any two elements is 4. One way of doing this will be M = {1, x, y, 5} (obviously, there are innumerable ways of writing M)
Here, x and y can take one of 3 different values: 1, 3 and 5 (x and y cannot be less than 1 or greater than 5 because the range of the set is 4).

Both x and y could be same. This can be done in 3 ways. Or x and y could be different. This can be done in 3C2 = 3 ways. Total x and y can take values in 3 + 3 = 6 ways.

(Note here that the number of ways in which you can select x and y is not 3*3 = 9. Why?)

For clarification, let me enumerate the 6 ways in which you can get the desired set:
{1, 1, 1, 5}, {1, 3, 3, 5}, {1, 5, 5, 5}, {1, 1, 3, 5}, {1, 1, 5, 5}, {1, 3, 5, 5}

Note here that the standard deviations of {1, 1, 1, 5} and {1, 5, 5, 5} are the same. Why? Because SD measures deviation from the mean. It has nothing to do with the actual value of the mean or the actual values of the numbers.

Mean of {1, 1, 1, 5} is 2. Three of the numbers are distance 1 away from mean and one number is distance 3 away from mean. Mean of {1, 5, 5, 5} is 4. Three of the numbers are distance 1 away from mean and one number is distance 3 away from mean. Sum of the squared deviations will be the same in both the cases and the number of elements is also the same in both the cases. Therefore, both these sets will have the same SD.

Similarly, {1, 1, 3, 5} and {1, 3, 5, 5} will have the same SD.

From the leftover sets, {1, 3, 3, 5} will have a distinct SD and {1, 1, 5, 5} will have a distinct SD.

In all, there are 4 different values that SD can take in such a case.

Note: It doesn’t matter what the actual numbers are. Since we have found 4 distinct values for SD, we will always have 4 distinct values of SD for a set under the given constraints.

Answer (B)
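
The enumeration above can be double-checked with a short Python sketch. It assumes, as the solution does, that the smallest element is 1 and the largest is 5, and it compares variances (using exact fractions) since two sets have equal SDs exactly when they have equal variances.

```python
from fractions import Fraction
from itertools import combinations_with_replacement
from statistics import pvariance

# All collections of four values from {1, 3, 5} whose range is 4
collections = [
    m for m in combinations_with_replacement([1, 3, 5], 4)
    if max(m) - min(m) == 4
]

# Population variance of each collection, kept exact with Fraction
variances = {m: pvariance([Fraction(x) for x in m]) for m in collections}
for m, v in variances.items():
    print(m, v)

distinct = set(variances.values())
print(len(collections), len(distinct))
```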

Hope the question was fun for you too!
A Range of Questions

BY KARISHMA, VERITAS PREP


Let’s discuss the idea of “range” today. It is simply the difference between the smallest and the greatest number in a set. Consider the following examples:

Range of {2, 6, 10, 25, 50} is 50 – 2 = 48

Range of {-20, 100, 80, 30, 600} is 600 – (-20) = 620

and so on…

That’s all the theory we have on the concept of range! So let’s jump on to some questions now (therein lies the challenge)!

Question 1: Which of the following cannot be the range of a set consisting of 5 odd multiples of 9?

(A) 72
(B) 144
(C) 288
(D) 324
(E) 436

Solution:

There are infinite possibilities regarding the multiples of 9 that can be included in the set. The set could be any one of the following (or any one of the other infinite possibilities):

S = {9, 27, 45, 63, 81} or

S = {9, 63, 81, 99, 153} or

S = {99, 135, 153, 243, 1071}

The range in each case will be different. The question asks us for the option that ‘cannot’ be the range. Let’s figure out the constraints on the range.

A set consisting of only odd multiples of 9 will have a range that is an even number (Odd Number – Odd Number = Even number)
Also, the range will be a multiple of 9 since both, the smallest and the greatest numbers, will be multiples of 9. So their difference will also be a multiple of 9.

Only one option will not satisfy these constraints. Do you remember the divisibility rule of 9? The sum of the digits of the number should be divisible by 9 for the number to be divisible by 9. The sum of the digits of 436 is 4 + 3 + 6 = 13 which is not divisible by 9. Hence 436 is not divisible by 9 and therefore cannot be the range of the set.

Answer (E).
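
The two constraints (even, and a multiple of 9) can be checked against the options in Python. The sketch below also builds an example set for each valid range; it assumes the five elements are distinct, and the helper names are hypothetical.

```python
options = {"A": 72, "B": 144, "C": 288, "D": 324, "E": 436}

def can_be_range(r, size=5):
    """An even multiple of 9 is a multiple of 18; `size` distinct odd
    multiples of 9 are spread over at least 18 * (size - 1)."""
    return r % 18 == 0 and r >= 18 * (size - 1)

def example_set(r):
    """Five odd multiples of 9 with range r (assumes can_be_range(r))."""
    return [9, 27, 45, 63, 9 + r]

for label, r in options.items():
    if can_be_range(r):
        s = example_set(r)
        assert all(x % 9 == 0 and x % 2 == 1 for x in s) and max(s) - min(s) == r
        print(label, r, s)
    else:
        print(label, r, "cannot be the range")

impossible = [label for label, r in options.items() if not can_be_range(r)]
```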

On to another one now:

Question 2: If the arithmetic mean of n consecutive odd integers is 20, what is the greatest of the integers?

(1) The range of the n integers is 18.

(2) The least of the n integers is 11.

Solution: We have discussed mean in case of arithmetic progressions in the previous posts. If mean of consecutive odd integers is 20, what do you think the integers will look like?

19, 21 or
17, 19, 21, 23 or
15, 17, 19, 21, 23, 25 or
13, 15, 17, 19, 21, 23, 25, 27 or
11, 13, 15, 17, 19, 21, 23, 25, 27, 29
etc.

Does it make sense that the required numbers will represent one such sequence? The numbers in the sequence will be equally distributed around 20. Every time you add a number to the left, you need to add one to the right to keep the mean 20. The smallest sequence will have 2 numbers, 19 and 21, and there is no upper limit on how long the sequence can be. Did you notice that each one of these sequences has a unique “range,” a unique “least number” and a unique “greatest number?” So if you are given any one statistic of the sequence, you will know the entire desired sequence.

Statement 1: Only one possible sequence: 11, 13, 15, 17, 19, 21, 23, 25, 27, 29 will have the range 18. The greatest number here is 29. This statement alone is sufficient.

Statement 2: Only one possible sequence: 11, 13, 15, 17, 19, 21, 23, 25, 27, 29 will have 11 as the least number. The greatest number here is 29. This statement alone is sufficient too.

Answer (D).

Note that you don’t actually have to find the exact sequence. All you need to understand is that each sequence will have a unique “range” and a unique “least number.”
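
The family of sequences can be generated in Python to confirm that each statement singles out exactly one of them. This is a sketch only: `odd_run` is a hypothetical helper, and the list is capped at 10 pairs purely to keep the output short.

```python
def odd_run(pairs):
    """Consecutive odd integers with `pairs` numbers on each side of 20."""
    return list(range(21 - 2 * pairs, 20 + 2 * pairs, 2))

runs = [odd_run(p) for p in range(1, 11)]
assert all(sum(r) == 20 * len(r) for r in runs)       # every run has mean 20

by_range = [r for r in runs if r[-1] - r[0] == 18]    # Statement 1
by_least = [r for r in runs if r[0] == 11]            # Statement 2
print(by_range)
print(by_least)
```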
Dealing with Standard Deviation II

BY KARISHMA, VERITAS PREP


In this post, we pick up from where we left off in the post above. Let’s discuss the last 3 cases first.

Question: Which set, S or T, has higher SD?

Case 5: S = {1, 3, 5} or T = {1, 3, 3, 5}

The standard deviation (SD) of T will be less than the SD of S. Why? The mean of 1, 3 and 5 is 3. If you add another 3 to the list, the mean stays the same and the sum of the squared deviations is also the same but the number of elements increases. Hence, the SD decreases.

Case 6: S = {6, 8, 10} or T = {12, 16, 20}

Put the numbers on the number line. You will see that the SD of T is greater than the SD of S. When you multiply each element of a set by the same number greater than 1 (T is obtained by multiplying each element of S by 2), the SD increases.

Case 7: S = {6, 8, 10} or T = {3, 4, 5}

Put the numbers on the number line. You will see that the SD of T is less than the SD of S. When you divide each element of a set by the same number greater than 1 (T is obtained by dividing each element of S by 2 OR you can say that S is obtained by multiplying each element of T by 2), the SD decreases.
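
If you want to confirm the three cases numerically, here is a short Python sketch using `pstdev` (population standard deviation) from the standard `statistics` module. The dictionary layout is just an illustrative choice.

```python
from statistics import pstdev

cases = {
    5: ([1, 3, 5], [1, 3, 3, 5]),
    6: ([6, 8, 10], [12, 16, 20]),
    7: ([6, 8, 10], [3, 4, 5]),
}

for case, (s, t) in cases.items():
    sd_s, sd_t = pstdev(s), pstdev(t)
    higher = "S" if sd_s > sd_t else "T"
    print(f"Case {case}: SD(S) = {sd_s:.3f}, SD(T) = {sd_t:.3f}, higher: {higher}")
```

In cases 6 and 7 the SD of T is exactly 2 times and half the SD of S respectively, matching the factor applied to each element.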

Now that we have an understanding of how SD behaves, let’s look at a question.

Question 1: A certain list of 300 test scores has an arithmetic mean of 75 and a standard deviation of d, where d is positive. Which of the following two test scores, when added to the list, must result in a list of 302 test scores with a standard deviation less than d?

(A) 75 and 80
(B) 80 and 85
(C) 70 and 75
(D) 75 and 75
(E) 70 and 80

Solution: As discussed above, the standard deviation of a set measures the deviation from the mean. A low standard deviation indicates that the data points are very close to the mean whereas a high standard deviation indicates that the data points are spread far apart from the mean.

When we add numbers that are far from the mean, we are stretching the set and hence, increasing the SD. When we add numbers which are close to the mean, we are shrinking the set and hence, decreasing the SD.

Therefore, adding two numbers which are closest to the mean will shrink the set the most, thus decreasing SD by the greatest amount.

Numbers closest to the mean are 75 and 75 (they are equal to the mean) and thus adding them will decrease SD the most.

Answer: D.
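
The word "must" in the question matters, and a made-up list shows why. The 300 scores below are hypothetical (150 scores of 74 and 150 scores of 76, so the mean is 75 and d is small); the sketch simply tests each option against this one list.

```python
from statistics import mean, pstdev

scores = [74] * 150 + [76] * 150       # hypothetical list: mean 75, small SD
assert mean(scores) == 75
d = pstdev(scores)

options = {
    "A": (75, 80),
    "B": (80, 85),
    "C": (70, 75),
    "D": (75, 75),
    "E": (70, 80),
}

# Options whose two new scores give a 302-score list with SD below d
lowers_sd = [
    label for label, pair in options.items()
    if pstdev(scores + list(pair)) < d
]
print(d, lowers_sd)
```

When the original scores are tightly packed, even a score of 80 counts as "far" from the mean, so only two scores equal to the mean are guaranteed to lower the SD.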

Now that we have seen that difficult looking questions on SD can be quite simple, I want you to think about something – when you add some new numbers to a set, how do you decide whether SD increases or decreases? If you notice, we have seen two different cases (case 4 and case 5) – in one of them SD increases when you add two numbers to the set and in the other, SD decreases. So how do you decide whether SD will increase or decrease? Say, what happens in case S = {3, 4, 5, 6, 7} and T = {3, 4, 4, 5, 6, 6, 7}? Will SD increase or decrease in this case? How do you decide the point at which the increase in the numerator offsets the increase in the denominator?

Meanwhile, let’s look at one more question.

Question 2: If 100 is included in each of sets A, B and C (given A= {30, 50, 70, 90, 110}, B = {-20, -10, 0, 10, 20} and C= {30, 35, 40, 45, 50}), which of the following represents the correct ordering (largest to smallest) of the sets in terms of the absolute increase in their standard deviation?

(A) A, C, B
(B) A, B, C
(C) C, A, B
(D) B, A, C
(E) B, C, A

Solution: The question looks a little convoluted but actually you don’t have to calculate anything. SD measures the deviation of the elements from the mean. If a new element is added which is far away from the mean, it will add much more to the deviations than if it were added close to the mean.
The means of A, B and C are 70, 0 and 40, respectively.
100 is farthest from 0 so it will change the SD of set B the most (in terms of absolute increase). It is closest to 70 so it will change the SD of set A the least. Hence the correct ordering is B, C, A.

Answer (E) This question is discussed HERE.
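
If you would like to check this ordering numerically, here is a minimal Python sketch. It assumes the population standard deviation (divide by n), which is what `statistics.pstdev` computes; the names `sets`, `increase` and `ordering` are illustrative only.

```python
from statistics import pstdev

sets = {
    "A": [30, 50, 70, 90, 110],
    "B": [-20, -10, 0, 10, 20],
    "C": [30, 35, 40, 45, 50],
}

# Change in the population SD when 100 is included in each set.
increase = {name: pstdev(values + [100]) - pstdev(values)
            for name, values in sets.items()}

# Order the sets from the largest change to the smallest.
ordering = sorted(increase, key=increase.get, reverse=True)
print(ordering)  # ['B', 'C', 'A']
```

Under this assumption the SD of set A hardly moves at all (it actually dips very slightly), because 100 lies only about one standard deviation away from A's mean of 70.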

Simple enough, right? SD questions are generally straightforward once you understand the basics well.
3 Important Concepts for Statistics Questions on the GMAT

BY KARISHMA, VERITAS PREP


We have discussed these three concepts of statistics in detail:

Arithmetic mean is the number that can represent/replace all the numbers of the sequence. It lies somewhere in between the smallest and the largest values.

Median is the middle number (in case the total number of numbers is odd) or the average of two middle numbers (in case the total number of numbers is even).

Standard deviation is a measure of the dispersion of the values around the mean.

A conceptual question is how these three measures change when all the numbers of the set are varied in a similar fashion.

For example, how does the mean of a set change when all the numbers are increased by say, 10? How does the median change? And what about the standard deviation? What happens when you multiply each element of a set by the same number?

Let’s discuss all these cases in detail but before we start, we would like to point out that the discussion will be conceptual. We will not get into formulas though you can arrive at the answer by manipulating the respective formulas.

When you talk about mean or median or standard deviation of a list of numbers, imagine the numbers lying on the number line. They would be spread on the number line in a certain way. For example,

——0—a———b—c———————d———e————————f—g———————

Case I:

When you add the same positive number (say x) to all the elements, the entire bunch of numbers moves ahead together on the number line. The new numbers a’, b’, c’, d’, e’, f’ and g’ would look like this

——0——————a’———b’—c’———————d’———e’————————f’—g’——————

The relative placement of the numbers does not change. They are still at the same distance from each other. Note that the numbers have moved further to the right of 0 now to show that they have moved ahead on the number line.

The mean lies somewhere in the middle of the bunch and will move forward by the added number. Say, if the mean was d, the new mean will be \(d’ = d + x\).

So when you add the same number to each element of a list, 

New mean = Old mean + Added number.

On similar lines, the median is the middle number (d in this case) and will move ahead by the added number. The new median will be \(d’ = d + x\)

So when you add the same number to each element of a list, 

New median = Old median + Added number

Standard deviation is a measure of dispersion of the numbers around the mean and this dispersion does not change when the whole bunch moves ahead as it is. Standard deviation does not depend on where the numbers lie on the number line. It depends on how far the numbers are from the mean. So standard deviation of 3, 5, 7 and 9 is the same as the standard deviation of 13, 15, 17 and 19. The relative placement of the numbers in both the cases will be the same. Hence, if you add the same number to each element of a list, the standard deviation will stay the same.
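
Case I can be checked with a few lines of Python. This is only a sketch: it reuses the 3, 5, 7, 9 example from the paragraph above, takes x = 10 as the added number, and assumes the population SD as computed by `statistics.pstdev`.

```python
from statistics import mean, median, pstdev

numbers = [3, 5, 7, 9]
x = 10  # the same number added to every element
shifted = [value + x for value in numbers]  # [13, 15, 17, 19]

print(mean(numbers), mean(shifted))      # mean moves ahead by x
print(median(numbers), median(shifted))  # median moves ahead by x
print(pstdev(numbers), pstdev(shifted))  # SD stays the same
```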

Case II:

Let’s now move on to the discussion of multiplying each element by the same positive number.

The original placing of the numbers on the number line looked like this:

——0—a———b—c———————d———e————————f—g———————

The new placing of the numbers on the number line will look something like this:

——0———a’——————b’———c’————————————d’—————————e’— etc

The numbers spread out. To understand this, take an example. Say, the initial numbers were 10, 20 and 30. If you multiply each number by 2, the new numbers are 20, 40 and 60. The difference between them has increased from 10 to 20.

If you multiply each number by x, the mean also gets multiplied by x. So, if d was the mean initially, d’ will be the new mean which is \(x*d\).

New mean = Old mean * Multiplied number

Similarly, the median will also get multiplied by x.

New median = Old median * Multiplied number

What happens to standard deviation in this case? It changes! When the multiplier is greater than 1, the numbers move further apart from the mean, so their dispersion increases and hence the standard deviation also increases. (A multiplier between 0 and 1 pulls the numbers closer together instead.) In either case, the new standard deviation will be x times the old standard deviation. You can also establish this using the standard deviation formula.

New standard deviation = Old standard deviation * Multiplied number

The same concept is applicable when you increase each number by the same percentage. It is akin to multiplying each element by the same number. Say, if you increase each number by 20%, you are, in effect, multiplying each number by 1.2. So our case II applies here.
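
For those who do want to see the formula route, here is a short sketch of it. The notation is an assumption made for this illustration: the elements are \(A_1, A_2, ..., A_n\), their mean is \(A_{avg}\), the multiplier x is positive, and SD is the square root of the average squared deviation from the mean.

\(SD_{old} = \sqrt{\frac{\sum_{i=1}^{n} (A_i - A_{avg})^2}{n}}\)

\(SD_{new} = \sqrt{\frac{\sum_{i=1}^{n} (x*A_i - x*A_{avg})^2}{n}} = \sqrt{\frac{x^2 \sum_{i=1}^{n} (A_i - A_{avg})^2}{n}} = x*SD_{old}\)

The second line uses the fact that the new mean is \(x*A_{avg}\), so every deviation from the mean gets multiplied by x as well.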

Now, think about what happens when you subtract/divide each element by the same number.
Solving GMAT Standard Deviation Problems By Using as Little Math as Possible

BY David Goldstein, VERITAS PREP


The other night I taught our Statistics lesson, and when we got to the section of class that deals with standard deviation, there was a familiar collective groan – not unlike the groan one encounters when doing compound interest, or any mathematical concept that, when we learned it in school, involved an intimidating-looking formula.

So, I think it’s time for me to coin an axiom: the more painful the traditional formula associated with a given topic, the simpler the actual calculations will be on the GMAT. (Please note, though the axiom is awaiting official mathematical verification by Veritas’ hard-working team of data scientists, the anecdotal evidence in support of the axiom is overwhelming.)

So, let’s talk standard deviation. If you’re like my students, your first thought is to start assembling a list of increasingly frantic questions: Do we need to know that horrible formula I learned in Stats class? (No.) Do we need to know the relationship between variance and Standard deviation? (You just need to know that there is a relationship, and that if you can solve for one, you can solve for the other.) Etc.

So, rather than droning on about what we don’t need to know, let’s boil down what we do need to know about standard deviation. The good news – it isn’t much. Just make sure you’ve internalized the following:

    * The standard deviation is a measure of the dispersion of the elements of the set around the mean. The farther away the terms are from the mean, the larger the standard deviation.
    * If we were to increase or decrease each element of the set by “x,” the standard deviation would remain unchanged.
    * If we were to multiply each element of the set by a positive “x,” the standard deviation would also be multiplied by “x.”
    * If the mean of a set is “m” and the standard deviation is “d,” then to say that something is within 3 standard deviations of the mean is to say that it falls within the interval of (m – 3d) to (m + 3d). And to say that something is within 2 standard deviations of the mean is to say that it falls within the interval of (m – 2d) to (m + 2d).

That’s basically it. Not anything to get too worked up about. So, let’s see some of these principles in action to substantiate the claim that we won’t have to do too much arithmetical grinding on these types of questions:

If d is the standard deviation of x, y, z, what is the standard deviation of x+5, y+5, z+5 ?
A) d
B) 3d
C) 15d
D) d+5
E) d+15

If our initial set is x, y, z, and our new set is x+5, y+5, and z+5, then we’re adding the same value to each element of the set. We already know that adding the same value to each element of the set does not change the standard deviation. Therefore, if the initial standard deviation was d, the new standard deviation is also d. We’re done – the answer is A. (You can see this with a simple example. If your initial set is {1, 2, 3} and your new set is {6, 7, 8} the dispersion of the set clearly hasn’t changed.) This question is discussed HERE.

Surely the questions get harder than this, you say. They do, but if you know the aforementioned core concepts, they’re all quite manageable. Here’s another one:

Some water was removed from each of 6 tanks. If standard deviation of the volumes of water at the beginning was 10 gallons, what was the standard deviation of the volumes at the end?

1) For each tank, 30% of water at the beginning was removed
2) The average volume of water in the tanks at the end was 63 gallons


We know the initial standard deviation. We want to know if it’s possible to determine the new standard deviation after water is removed. To the statements we go!

Statement 1: If 30% of the water is removed from each tank, we know that each term in the set is multiplied by the same value: 0.7. Well, if each term in a set is multiplied by 0.7, then the standard deviation of the set is also multiplied by 0.7. If the initial standard deviation was 10 gallons, then the new standard deviation would be 10*(0.7) = 7 gallons. And we don’t even need to do the math – it’s enough to see that it’s possible to calculate this number. Therefore, Statement 1 alone is sufficient.
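
To see Statement 1 at work, here is a small Python sketch. The six starting volumes are purely hypothetical (the question never gives them); they were picked only because their population SD is 10 gallons.

```python
from statistics import pstdev

# Hypothetical starting volumes in gallons, chosen so that the SD is 10.
initial = [80, 100, 80, 100, 80, 100]

# Removing 30% from every tank multiplies each volume by 0.7.
remaining = [0.7 * volume for volume in initial]

print(round(pstdev(initial), 2))    # 10.0
print(round(pstdev(remaining), 2))  # 7.0
```

Any other set of six volumes with an SD of 10 would give the same final SD of 7, which is why the statement is sufficient.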

Statement 2: Knowing the average of a set is not going to tell us very much about the dispersion of the set. To see why, imagine a simple case in which we have two tanks, and the average volume of water in the tanks is 63 gallons. It’s possible that each tank has exactly 63 gallons and, if so, the standard deviation would be 0, as everything would equal the mean. It’s also possible to have one tank that had 126 gallons and another tank that was empty, creating a standard deviation that would, of course, be significantly greater than 0. So, simply knowing the average cannot possibly give us our standard deviation. Statement 2 alone is not sufficient to answer the question.
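
The two-tank illustration above can be written out as a quick Python check (a sketch using the population SD from `statistics.pstdev`; the list names are made up for this example):

```python
from statistics import mean, pstdev

equal_tanks = [63, 63]    # both tanks hold exactly 63 gallons
uneven_tanks = [126, 0]   # one full tank, one empty tank

print(mean(equal_tanks), pstdev(equal_tanks))    # same mean, SD of 0
print(mean(uneven_tanks), pstdev(uneven_tanks))  # same mean, SD of 63
```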

And the answer is A. This question is discussed HERE.

Maybe at this point you’re itching for more of a challenge. Let’s look at a slightly tougher one:

7.51; 8.22; 7.86; 8.36
8.09; 7.83; 8.30; 8.01
7.73; 8.25; 7.96; 8.53

A vending machine is designed to dispense 8 ounces of coffee into a cup. After a test that recorded the number of ounces of coffee in each of 1000 cups dispensed by the vending machine, the 12 listed amounts, in ounces, were selected from the data above. If the 1000 recorded amounts have a mean of 8.1 ounces and a standard deviation of 0.3 ounces, how many of the 12 listed amounts are within 1.5 standard deviations of the mean?

A) Four
B) Six
C) Nine
D) Ten
E) Eleven

Okay, so the standard deviation is 0.3 ounces. We want the values that are within 1.5 standard deviations of the mean. 1.5 standard deviations would be (1.5)(0.3) = 0.45 ounces, so we want all of the values that are within 0.45 ounces of the mean. If the mean is 8.1 ounces, this means that we want everything that falls between a lower bound of (8.1 – 0.45) and an upper bound of (8.1 + 0.45). Put another way, we want the number of values that fall between 8.1 – 0.45 = 7.65 and 8.1 + 0.45 = 8.55.

Looking at our 12 values, we can see that only one value, 7.51, falls outside of this range. If we have 12 total values and only 1 falls outside the range, then the other 11 are clearly within the range, so the answer is E. This question is discussed HERE.
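
The counting step can be sketched in Python as follows. The variable names are illustrative; the 12 amounts, the mean and the standard deviation are taken straight from the question.

```python
amounts = [7.51, 8.22, 7.86, 8.36,
           8.09, 7.83, 8.30, 8.01,
           7.73, 8.25, 7.96, 8.53]

mean_oz, sd_oz, k = 8.1, 0.3, 1.5
lower = mean_oz - k * sd_oz   # about 7.65
upper = mean_oz + k * sd_oz   # about 8.55

within = [a for a in amounts if lower <= a <= upper]
outside = [a for a in amounts if not (lower <= a <= upper)]
print(len(within), outside)   # 11 [7.51]
```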

As you can see, there’s very little math involved, even on the more difficult questions.

Takeaway: remember the axiom that the more complex-looking the formula is for a concept, the simpler the calculations are likely to be on the GMAT. An intuitive understanding of a topic will always go a lot further on this test than any amount of arithmetical virtuosity.
Dealing with Standard Deviation

BY KARISHMA, VERITAS PREP


In this post, we will work our way through the concepts of Standard Deviation (SD). Let’s take a look at how you calculate standard deviation first:

\(SD = \sqrt{\frac{\sum_{i=1}^{n} (A_i - A_{avg})^2}{n}}\)

\(A_i\) – The numbers in the list

\(A_{avg}\) – Arithmetic mean of the list

\(n\) – Number of numbers in the list

Say you have 3 numbers : 11, 13 and 15. Their standard deviation is the “square root of the average of their squared deviations from the arithmetic mean.” Let’s see what we mean by this.

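Mean of 11, 13 and 15 is 13.

\(SD = \sqrt{\frac{(11 - 13)^2 + (13 - 13)^2 + (15 - 13)^2}{3}} = \sqrt{\frac{8}{3}} \approx 1.63\)
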
Focus on these words: “deviations from mean”

The important point to note is that SD is a measure of dispersion or deviation from the mean (the mean is approximately the middle of the list if there are no outliers). In other words, SD is a measure of whether the numbers are very far away from the mean or close together. Since GMAT isn’t calculation intensive, you probably won’t need to calculate the actual SD in the test. The calculations are shown here only to illustrate the concept. But you must have a feel for how the numbers are distributed around the mean and what that implies for the SD.

Your statistics book explains in detail how to visualize SD using the number line, so I am not going to delve deep into it but will quickly recap so that we can move ahead. Recall that if you plot the numbers on the number line, it gives you a sense of how far the numbers are from the mean. The farther the numbers are from the mean, the higher the SD.

Let’s check out a few different cases to internalize the SD concept. Do not calculate anything in these questions. Just look at the number line for each case and figure out whether it makes sense to you.

Question: Which set, S or T, has higher SD?

Case 1: S = {3, 3, 3} or T = {0, 10, 20}

Case 2: S = {3, 4, 5} or T = {5, 6, 7}

Case 3: S = {3, 4, 5, 6} or T = {2, 3, 4, 5, 6, 7}

Case 4: S = {1, 3, 5} or T = {1, 1, 3, 5, 5}

Case 5: S = {1, 3, 5} or T = {1, 3, 3, 5}

Case 6: S = {6, 8, 10} or T = {12, 16, 20}

Case 7: S = {6, 8, 10} or T = {3, 4, 5}

Let me represent the first four cases on the number line. Check them out and then think which set should have the higher SD.



Let’s discuss each of these four cases now.

Case 1: S = {3, 3, 3} or T = {0, 10, 20}

T has higher SD. We will obtain the SD of T by calculating as shown in the example above. But we don’t really need to calculate it because we see that for set S, SD = 0. Each number is at the mean and hence has 0 deviation from the mean. Since SD cannot be negative, whatever the SD of T, it will be higher than the SD of S which is 0.

Case 2: S = {3, 4, 5} or T = {5, 6, 7}

Both sets have the same SD. We can see from the number line that they are equally dispersed around their respective means.

Case 3: S = {3, 4, 5, 6} or T = {2, 3, 4, 5, 6, 7}

Set T has higher SD. T has two extra numbers which are farther from the mean. Hence these 2 numbers will add to the total deviation. (There is a caveat here which we will discuss next week.)

Case 4: S = {1, 3, 5} or T = {1, 1, 3, 5, 5}

T has higher SD. It has two extra numbers far from the mean. (There is a caveat here too!)
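
Once you have reasoned through the first four cases on the number line, you can confirm them with a short Python sketch. It assumes the population SD (divide by n) via `statistics.pstdev`, and the `cases` dictionary is just a convenient way to hold the sets.

```python
from statistics import pstdev

cases = {
    1: ([3, 3, 3], [0, 10, 20]),
    2: ([3, 4, 5], [5, 6, 7]),
    3: ([3, 4, 5, 6], [2, 3, 4, 5, 6, 7]),
    4: ([1, 3, 5], [1, 1, 3, 5, 5]),
}

for case, (s, t) in cases.items():
    print(f"Case {case}: SD(S) = {pstdev(s):.3f}, SD(T) = {pstdev(t):.3f}")
```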

What do you think about cases 5, 6, and 7? I will give you the answers to these three cases in the next post!

Attachments: June4_2011_Image1.jpg, June4_2011_Image2.jpg, June4_2011_Image3.jpg
How to Quickly Solve Standard Deviation Questions on the GMAT

BY RON AWAD, VERITAS PREP


The quantitative section of the GMAT is designed to test your understanding and application of concepts you learned in high school. The exam focuses on core mathematical concepts such as algebra, geometry and statistics. However some concepts are more engrained in the high school curriculum than others. Everyone’s done addition, multiplication, subtraction and division, but sometimes figuring out factorials or square roots may be a little more unusual.

Perhaps no concept perplexes students on the GMAT more than the standard deviation. The standard deviation (often represented by σ) is a measure of dispersion around the mean. It indicates how close the numbers in a set are to the set’s average. As a simple example, the sets {5, 10, 15} and {8, 10, 12} both have the same mean (10); however, they do not have the same standard deviation.

Knowing how to calculate the standard deviation is not required on the GMAT, but knowing how it’s calculated gives you a tremendous edge in answering questions. It’s a four step process:

    1)      Find the average (mean) of the set.

    2)      Find the differences between each element of the set and that average.

    3)      Square all the differences and take the average of the squared differences. This gives you the variance.

    4)      Take the square root of the variance.

In this example, the average of the first set is clearly 10. The differences between the three elements and the average are (-5, 0 and 5). Taking the square of these numbers, we get (25, 0 and 25). The average of these numbers is 50/3 or 16.67. The square root of this number will not be an integer, but it will be very close to 4. So we can assume roughly ~4 or ~4.1.

In contrast, the second set of numbers will have a much smaller standard deviation. The average is still 10, but the differences are now (-2, 0 and 2). Taking the square of these numbers, we get (4, 0 and 4). The average of these numbers is 8/3 or 2.67. The square root of 2.67 is roughly ~1.6 or ~1.7, but it’s very hard to pin down without a calculator or a lot of extra time.
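
With a computer at hand, the four steps can be sketched in Python like this. The function name `standard_deviation` is made up for this illustration, and it divides by the number of elements, as the steps above describe.

```python
from math import sqrt

def standard_deviation(values):
    average = sum(values) / len(values)                        # step 1: mean
    differences = [v - average for v in values]                # step 2: differences
    variance = sum(d ** 2 for d in differences) / len(values)  # step 3: variance
    return sqrt(variance)                                      # step 4: square root

print(round(standard_deviation([5, 10, 15]), 2))  # 4.08
print(round(standard_deviation([8, 10, 12]), 2))  # 1.63
```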

This example should help highlight why the standard deviation is not explicitly calculated on an exam without a calculator: the chances of it being an integer are relatively low. However, the concept it represents and the idea behind it are fair game on the test. One of the simple takeaways from the math behind the process is that the farther a number is from the mean of the set, the more it will increase the standard deviation. Specifically, each number’s contribution grows with the square of its distance from the mean, so a difference of 5 counts for much more than a difference of 2.

This kind of concept can be tested on the exam, but if you know what you’re looking for, you can answer standard deviation questions very quickly. Let’s look at an example:

For the set {2, 2, 3, 3, 4, 4, 5, 5, x}, which of the following values of x will most increase the standard deviation?

(A)   1
(B)   2
(C)   3
(D)   4
(E)    5

If you recall the steps to calculating the standard deviation, what we really need to do first is to calculate the mean. (i.e. how mean are you?) You can add the eight elements together and divide by eight, but the fact that these elements follow a fairly obvious pattern helps us as well. The numbers each appear twice, and they are evenly spaced. This means that the average will be the same as the median, and the median is 3.5. Even if you take the long way, it shouldn’t take you more than 20 seconds to find that the mean of this set is 3.5.

The next step is to take each element and find the difference from the mean, but this is what we need to do if the goal is to actually calculate the standard deviation. All we’re being tasked to do here is to determine which number will increase the standard deviation the most. In this regard, all we need to do is figure out which answer choice is furthest from the mean. That number will produce the biggest distance, which will then be squared and in turn produce the biggest difference in standard deviation. So although you can spend a lot of time calculating every last detail of this question, what it actually comes down to is “which of these numbers is furthest from 3.5”.

Asking about distance from a specific number is much more straightforward, and probably an elementary school level question. Yet, if you understand the concept, you can turn a GMAT question into something a 5th grader could answer (Are you smarter than a 5th grader?). The answer is thus obviously choice A, as 1 is as far from 3.5 as possible given only these five choices. This question is discussed HERE.
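
If you want to confirm the shortcut by brute force, a minimal Python sketch is shown here. It assumes the population SD from `statistics.pstdev` and simply tries each answer choice as x; the dictionary names are illustrative.

```python
from statistics import pstdev

base = [2, 2, 3, 3, 4, 4, 5, 5]
choices = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}

# SD of the nine-element set for each candidate value of x.
sd_with_x = {label: pstdev(base + [x]) for label, x in choices.items()}

best = max(sd_with_x, key=sd_with_x.get)
print(best)  # A
```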

The important thing about the standard deviation is that you will never have to formally calculate it, but understanding the underlying concept will help you excel at the quantitative section of the GMAT. Most standard deviation questions hinge primarily on the distance from the mean, as everything else is just a rote division or addition. Much like taking five practice exams and getting wildly different scores, having a high variance is bad for knowing what to expect. Understanding the way standard deviations are tested on the GMAT will help you consistently get the questions right and reduce the variance of your results (hopefully with a very high mean).
A 750+ Level Question on SD

BY KARISHMA, VERITAS PREP


Above, we looked at a 750+ level question on mean, median and range concepts of Statistics. Here we have a 750+ level question on standard deviation concept of Statistics. We do hope you enjoy checking it out.

Before you begin, you might want to review the post that discusses standard deviation: Dealing With Standard Deviation

So here goes the question.

Question: Given that set S has four odd integers and their range is 4, how many distinct values can the standard deviation of S take?

(A) 3
(B) 4
(C) 5
(D) 6
(E) 7

Solution: Recall what standard deviation is. It measures the dispersion of all the elements from the mean. It doesn’t matter what the actual elements are and what the arithmetic mean is – the standard deviation of set {1, 3, 5} will be the same as the standard deviation of set {6, 8, 10} since in each set there are 3 elements such that one is at mean, one is 2 below the mean and one is 2 above the mean. So when we calculate the standard deviation, it will give us exactly the same value for both sets. Similarly, standard deviation of set {1, 3, 3, 5, 6} will be the same as standard deviation of {10, 12, 12, 14, 15} and so on. But note that the standard deviation of set {25, 27, 29, 29, 30} will be different because it represents a different arrangement on the number line.

Let’s look at the given question now.

Set S has four odd integers such that their range is 4. So it could look something like this {1, x, y, 5} when the elements are arranged in ascending order. Note that we have taken just one example of what set S could look like. There are innumerable other ways of representing it such as {3, x, y, 7} or {11, x, y, 15} etc.

Now in our example, x and y can take 3 different values: 1, 3 or 5

x and y could be same or different but x would always be smaller than or equal to y.

- If x and y were same, we could select the values of x and y in 3 different ways: both could be 1; both could be 3; both could be 5

- If x and y were different, we could select the values of x and y in 3C2 ways: x could be 1 and y could be 3; x could be 1 and y could be 5; x could be 3 and y could be 5.

For clarification, let’s enumerate the different ways in which we can write set S:

{1, 1, 1, 5}, {1, 3, 3, 5}, {1, 5, 5, 5}, {1, 1, 3, 5}, {1, 1, 5, 5}, {1, 3, 5, 5}

These are the 6 ways in which we can choose the numbers in our example.

Will all of them have unique standard deviations? Do all of them represent different distributions on the number line? Actually, no!

Standard deviations of {1, 1, 1, 5} and {1, 5, 5, 5} are the same. Why?

Standard deviation measures distance from mean. It has nothing to do with the actual value of mean and actual value of numbers. Note that the distribution of numbers on the number line is the same in both cases. The two sets are just mirror images of each other.

For the set {1, 1, 1, 5}, mean is 2. Three of the numbers are distance 1 away from mean and one number is distance 3 away from mean.

For the set {1, 5, 5, 5}, mean is 4. Three of the numbers are distance 1 away from mean and one number is distance 3 away from mean.

The deviations in both cases are the same -> 1, 1, 1 and 3. So when we square the deviations, add them up, divide by 4 and then find the square root, the figure we will get will be the same.

Similarly, {1, 1, 3, 5} and {1, 3, 5, 5} will have the same SD. Again, they are mirror images of each other on the number line.

The remaining two sets, {1, 3, 3, 5} and {1, 1, 5, 5}, will have distinct standard deviations since their distributions on the number line are unique.

In all, there are 4 different values that standard deviation can take in such a case.

Answer (B) This question is discussed HERE.
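
The enumeration above can be double-checked with a short Python sketch. It uses the {1, x, y, 5} example and compares variances (the squares of the SDs) as exact fractions, so that two equal standard deviations are never counted twice because of rounding. The helper name `variance` is an assumption made for this illustration.

```python
from fractions import Fraction
from itertools import combinations_with_replacement

def variance(values):
    # Exact population variance; equal variances mean equal SDs.
    n = len(values)
    avg = Fraction(sum(values), n)
    return sum((v - avg) ** 2 for v in values) / n

# x <= y, each chosen from 1, 3 or 5.
candidates = [(1, x, y, 5)
              for x, y in combinations_with_replacement([1, 3, 5], 2)]
distinct = {variance(s) for s in candidates}

print(len(candidates), len(distinct))  # 6 4
```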
Using the Standard Deviation Formula on the GMAT

BY KARISHMA, VERITAS PREP


We have discussed standard deviation (SD) in detail above. We know what the formula is for finding the standard deviation of a set of numbers, but we also know that GMAT will not ask us to actually calculate the standard deviation because the calculations involved would be way too cumbersome. It is still a good idea to know this formula, though, as it will help us compare standard deviations across various sets – a concept we should know well.

Today, we will look at some GMAT questions that involve sets with similar standard deviations such that it is hard to tell which will have a higher SD without properly understanding the way it is calculated. Take a look at the following question:

Which of the following distribution of numbers has the greatest standard deviation?

(A) {-3, 1, 2}
(B) {-2, -1, 1, 2}
(C) {3, 5, 7}
(D) {-1, 2, 3, 4}
(E) {0, 2, 4}

At first glance, these sets all look very similar. If we try to plot them on a number line, we will see that they also have similar distributions, so it is hard to say which will have a higher SD than the others. Let’s quickly review their deviations from the arithmetic means:

For answer choice A, the mean = 0 and the deviations are 3, 1, 2
For answer choice B, the mean = 0 and the deviations are 2, 1, 1, 2
For answer choice C, the mean = 5 and the deviations are 2, 0, 2
For answer choice D, the mean = 2 and the deviations are 3, 0, 1, 2
For answer choice E, the mean = 2 and the deviations are 2, 0, 2

We don’t need to worry about the arithmetic means (they just help us calculate the deviation of each element from the mean); our focus should be on the deviations. The SD formula squares the individual deviations and then adds them, then the sum is divided by the number of elements and finally, we find the square root of the whole term. So if a deviation is greater, its square will be even greater and that will increase the SD.

If the deviation increases and the number of elements increases, too, then we cannot be sure what the final effect will be – an increased deviation increases the SD but an increase in the number of elements increases the denominator and hence, actually decreases the SD. The overall effect as to whether the SD increases or decreases will vary from case to case.

First, we should note that answers C and E have identical deviations and numbers of elements, hence, their SDs will be identical. This means the answer is certainly not C or E, since Problem Solving questions have a single correct answer.

Let’s move on to the other three options:

For answer choice A, the mean = 0 and the deviations are 3, 1, 2
For answer choice B, the mean = 0 and the deviations are 2, 1, 1, 2
For answer choice D, the mean = 2 and the deviations are 3, 0, 1, 2

Comparing answer choices A and D, we see that they both have the same non-zero deviations (3, 1 and 2), but D has one more element, which sits at the mean. This means the sum of squared deviations is the same while D’s denominator is greater, and therefore, the SD of answer D is smaller than the SD of answer A. This leaves us with options A and B:

For answer choice A, the mean = 0 and the deviations are 3, 1, 2
For answer choice B, the mean = 0 and the deviations are 2, 1, 1, 2

Now notice that the sum of squared deviations for answer choice A is 9 + 1 + 4 = 14, whereas for answer choice B it is 4 + 1 + 1 + 4 = 10. Answer choice A has a greater sum of squared deviations and fewer elements than answer choice B. This means the SD of A will be higher than the SD of B, so the SD of A will be the highest. Hence, our answer must be A. This question is discussed HERE.
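
As a sanity check on the reasoning, here is a brief Python sketch that computes the population SD of all five options with `statistics.pstdev` (the dictionary names are illustrative):

```python
from statistics import pstdev

options = {
    "A": [-3, 1, 2],
    "B": [-2, -1, 1, 2],
    "C": [3, 5, 7],
    "D": [-1, 2, 3, 4],
    "E": [0, 2, 4],
}

sds = {label: pstdev(values) for label, values in options.items()}
greatest = max(sds, key=sds.get)
print(greatest)  # A
```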

Let’s try another one:

Which of the following data sets has the third largest standard deviation?

(A) {1, 2, 3, 4, 5}
(B) {2, 3, 3, 3, 4}
(C) {2, 2, 2, 4, 5}
(D) {0, 2, 3, 4, 6}
(E) {-1, 1, 3, 5, 7}

How would you answer this question without calculating the SDs? We need to arrange the sets in increasing SD order. Upon careful examination, you will see that the number of elements in each set is the same, and the mean of each set is 3.

Deviations of answer choice A: 2, 1, 0, 1, 2
Deviations of answer choice B: 1, 0, 0, 0, 1 (lowest SD)
Deviations of answer choice C: 1, 1, 1, 1, 2
Deviations of answer choice D: 3, 1, 0, 1, 3
Deviations of answer choice E: 4, 2, 0, 2, 4 (highest SD)

Obviously, option B has the lowest SD (the deviations are the smallest) and option E has the highest SD (the deviations are the greatest). This means we can automatically rule these answers out, as they cannot have the third largest SD.

Deviations of answer choice A: 2, 1, 0, 1, 2
Deviations of answer choice C: 1, 1, 1, 1, 2
Deviations of answer choice D: 3, 1, 0, 1, 3

Out of these options, answer choice D has a higher SD than answer choice A, since it has higher deviations of two 3s (whereas A has deviations of two 2s). Also, C is more tightly packed than A, with four deviations of 1. If you are not sure why, consider this:

The sum of squared deviations for C will be 1 + 1 + 1 + 1 + 4 = 8
The sum of squared deviations for A will be 4 + 1 + 0 + 1 + 4 = 10

So, A will have a higher SD than C but a lower SD than D. Arranging from lowest to highest SD’s, we get: B, C, A, D, E. Answer choice A has the third highest SD, and therefore, A is our answer. This question is discussed HERE.
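
The full ranking can be reproduced with a small Python sketch, again assuming the population SD from `statistics.pstdev`; `ranked` lists the options from the largest SD to the smallest.

```python
from statistics import pstdev

options = {
    "A": [1, 2, 3, 4, 5],
    "B": [2, 3, 3, 3, 4],
    "C": [2, 2, 2, 4, 5],
    "D": [0, 2, 3, 4, 6],
    "E": [-1, 1, 3, 5, 7],
}

ranked = sorted(options, key=lambda label: pstdev(options[label]), reverse=True)
print(ranked)     # ['E', 'D', 'A', 'C', 'B']
print(ranked[2])  # A
```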

Although we didn’t need to calculate the actual SD, we used the concepts of the standard deviation formula to answer these questions.
Current Student
Joined: 04 Jun 2018
Posts: 142
Own Kudos [?]: 66 [4]
Given Kudos: 139
GMAT 1: 710 Q50 V36
GMAT 2: 690 Q50 V32
GMAT 3: 610 Q48 V25
Bunuel wrote:
A 750 Level GMAT Question on Statistics!

BY KARISHMA, VERITAS PREP


In this post, we have a very interesting statistics question for you. Above, we have already discussed statistics concepts such as mean, median, range.

This question needs you to apply all these concepts but can still be easily done in under two minutes. Now, without further ado, let’s go on to the question – there is a lot to discuss there.

Question: An automated manufacturing unit employs N experts. Their average monthly salary is $7000 while the median monthly salary is only $5000. If the range of their monthly salaries is $10,000, what is the minimum value of N?

(A)10
(B)12
(C)14
(D)15
(E)20

Solution: Let’s first assimilate the information we have. We need to find the minimum number of experts that must be there. Why should there be a minimum number of people satisfying these statistics? Let’s try to understand that with some numbers.

For example, N cannot be 1, i.e., there cannot be a single expert in the unit, because then you cannot have a range of $10,000. You need at least two people to have a range, and the difference of their salaries would be the range in that case.

So there are at least 2 people – say one with salary 0 and the other with 10,000. No salary will lie outside this range.

Median is $5000, i.e., when all salaries are listed in increasing order, the middle salary (or the average of the middle two) is $5000. With 2 people, one at 0 and the other at 10,000, the median will be the average of the two, i.e., (0 + 10,000)/2 = $5000. Since the answer choices tell us there are at least 10 people, there is probably someone earning $5000. Let’s put in 5000 there for reference.

0 … 5000 … 10,000

The arithmetic mean of all the salaries is $7000. Now, the mean of 0, 5000 and 10,000 is $5000, not $7000, so we need to add some more people. We need to add them closer to 10,000 than to 0 to pull the mean up to $7000.

Let’s use deviations from the mean method to find where we need to add more people.

0 is 7000 less than 7000 and 5000 is 2000 less than 7000 which means we have a total of $9000 less than 7000. On the other hand, 10,000 is 3000 more than 7000. The deviations on the two sides of mean do not balance out. To balance, we need to add two more people at a salary of $10,000 so that the total deviation on the right of 7000 is also $9000. Note that since we need the minimum number of experts, we should add new people at 10,000 so that they quickly make up the deficit in the deviation. If we add them at 8000 or 9000 etc, we will need to add more people to make up the deficit at the right.
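
The balancing step above can be sketched in a few lines of Python. The helper `people_to_add` is hypothetical (it is not part of the original solution); it assumes the new people all earn the same salary and that this salary is above the target mean.

```python
def people_to_add(salaries, target_mean, new_salary):
    """Number of people to add at new_salary so the mean equals target_mean.

    Returns None if no whole number of people balances the deviations.
    """
    # Net shortfall of the current salaries relative to the target mean.
    net_deficit = sum(target_mean - s for s in salaries)
    # Surplus each new person contributes above the target mean.
    gain = new_salary - target_mean
    if gain <= 0:
        return None
    count, remainder = divmod(net_deficit, gain)
    return count if remainder == 0 else None

start = [0, 5000, 10000]
for salary in (10000, 9000, 8000):
    print(salary, people_to_add(start, 7000, salary))
```

Adding at 10,000 takes 2 people, while 9000 takes 3 and 8000 takes 6, which is why the solution adds everyone at the top of the range.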

Now we have

0 … 5000 … 10000, 10000, 10000

Now the mean is 7000 but note that the median has gone awry. It is 10,000 now instead of the 5000 that is required. So we will need to add more people at 5000 to bring the median back to 5000. But that will disturb our mean again! So when we add some people at 5000, we will need to add some at 10,000 too to keep the mean at 7000.

5000 is 2000 less than 7000 and 10,000 is 3000 more than 7000. We don’t want to disturb the total deviation from 7000. So every time we add 3 people at 5000 (which will be a total deviation of 6000 less than 7000), we will need to add 2 people at 10,000 (which will be a total deviation of 6000 more than 7000), to keep the mean at 7000 – this is the most important step. Ensure that you have understood this before moving ahead.
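
The 3-to-2 ratio follows from the sizes of the two deviations, and can be derived with a greatest common divisor. This is a minimal sketch; `balancing_batch` is an illustrative name, and it assumes `low < mean < high`.

```python
from math import gcd

def balancing_batch(low, high, mean):
    """Smallest group (count at low, count at high) that leaves the mean unchanged."""
    below = mean - low    # deviation of each person added at `low`
    above = high - mean   # deviation of each person added at `high`
    g = gcd(below, above)
    # Counts are swapped relative to the deviations so the totals cancel.
    return above // g, below // g

n_low, n_high = balancing_batch(5000, 10000, 7000)
print(n_low, n_high)
# Total deviation below the mean equals total deviation above it.
print(n_low * (7000 - 5000), n_high * (10000 - 7000))
```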

When we add 3 people at 5000 and 2 at 10,000, we are in effect adding an extra person at 5000 and hence it moves our median a bit to the left.

Let’s try one such set of addition:

0 … 5000, 5000, 5000, 5000 … 10000, 10000, 10000, 10000, 10000

The median is not $5000 yet. Let’s try one more set of addition.

0 … 5000, 5000, 5000, 5000, 5000, 5000, 5000 … 10000, 10000, 10000, 10000, 10000, 10000, 10000

The median now is $5000 and we have maintained the mean at $7000.

This gives us a total of 15 people.
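
The rounds of addition can be replayed with Python's `statistics` module. This is only a sketch under the solution's own assumption that the lowest salary is 0 and the highest is 10,000; `build_set` is a hypothetical helper.

```python
from statistics import mean, median

def build_set(rounds):
    """Base set plus `rounds` batches of 3 people at 5000 and 2 at 10,000."""
    base = [0, 5000, 10000, 10000, 10000]
    return sorted(base + [5000] * (3 * rounds) + [10000] * (2 * rounds))

for rounds in range(3):
    salaries = build_set(rounds)
    # Columns: number of people, mean, median, range.
    print(len(salaries), mean(salaries), median(salaries), max(salaries) - min(salaries))
```

The mean stays at 7000 in every round, and the median first reaches 5000 after the second round, with 15 people.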

Answer (D) This question is discussed HERE.

Granted, the question is tough but note that it uses very basic concepts and that is the hallmark of a good GMAT question!

Try to come up with some other methods of solving this.



The answer seems WRONG. The set of {5000,5000,5000,5000,5000,5000,7500,7500,10000,15000} solves this in N=10.
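
A quick check of the proposed set against the three conditions in the question as stated (mean $7000, median $5000, range $10,000), as a minimal Python sketch. The variable names are illustrative.

```python
from statistics import mean, median

proposed = [5000] * 6 + [7500, 7500, 10000, 15000]

stats = {
    "n": len(proposed),
    "mean": mean(proposed),
    "median": median(proposed),
    "range": max(proposed) - min(proposed),
}
print(stats)
```

Note that this set has a lowest salary of 5000, whereas the quoted solution assumes salaries running from 0 to 10,000.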
Intern
Joined: 11 Oct 2018
Posts: 19
Own Kudos [?]: 27 [4]
Given Kudos: 40
Location: Germany
Statistics Made Easy - All in One Topic! [#permalink]
4
Kudos
nitesh50 wrote:
The answer seems WRONG. The set of {5000,5000,5000,5000,5000,5000,7500,7500,10000,15000} solves this in N=10.


That's true. Bunuel, can you elaborate on the approach to this result?
EDIT: I found the original question that matches the given answer. The missing information is that the average is not $7000, but $7000 more than the least salary.
Intern
Joined: 08 Jun 2019
Posts: 28
Own Kudos [?]: 12 [1]
Given Kudos: 28
Re: Statistics Made Easy - All in One Topic! [#permalink]
1
Bookmarks
Hi, is there a PDF version of the content on this thread?