Bunuel
A quality inspector recorded the fill volumes, in milliliters, of six bottles from a production run, with the following results:
{24, 38, 53, 23, 56, 34}
The inspector believes exactly one of the six measurements is erroneous and will discard that measurement. To evaluate the impact of the discard, he will calculate the standard deviation of the original six measurements and the standard deviation of the remaining five measurements.
Select for "Most decrease" the measurement which, if discarded, would produce the greatest decrease in standard deviation, and select for "Most increase" the measurement which, if discarded, would produce the greatest increase in standard deviation. Make only two selections, one in each column.
The six fill volumes, in millilitres, sorted in ascending order are {23, 24, 34, 38, 53, 56}.
The sum = 228
Mean = 228/6 = 38
Standard Deviation (S.D) is a measure of how spread out the data is with respect to the mean.
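As a quick check of the figures so far, here is a minimal Python sketch. It assumes the population standard deviation (`statistics.pstdev`); the question does not say which version is intended.

```python
import statistics

# Fill volumes in millilitres, taken from the question.
volumes = [24, 38, 53, 23, 56, 34]

mean = sum(volumes) / len(volumes)

# Population SD: square root of the average squared distance from the mean.
sd = statistics.pstdev(volumes)

print(mean)          # 38.0
print(round(sd, 2))  # 12.82
```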
We need to find the value whose removal increases the S.D the most, and the value whose removal decreases it the most.
Case 1: Removing a value that is equal or closest to the mean increases the spread.
Removing 38, which is exactly the mean, leaves the mean at 38.
The total squared distance from the mean is unchanged, but it is now shared among five values instead of six, which pushes the spread (S.D) up.
So the S.D INCREASES most when 38 is removed.
Case 2: Removing the value farthest from the mean decreases the spread. Here that value is 56, which is 18 away from the mean of 38.
The new mean is 172/5 = 34.4.
The remaining values {23, 24, 34, 38, 53} sit closer to the new mean, on average, than the original six did to 38.
Thus, the S.D DECREASES most when 56 is removed.
Most S.D increase = 38
Most S.D decrease = 56
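The answer can be checked by brute force: discard each measurement in turn and compare the new S.D with the original. This is only a sketch, and it assumes the population S.D (`statistics.pstdev`). Because every candidate leaves five values, the sample S.D would rank the six options the same way.

```python
import statistics

volumes = [24, 38, 53, 23, 56, 34]
original_sd = statistics.pstdev(volumes)

# Change in SD when each measurement is discarded in turn.
# The six values are distinct, so they can serve as dictionary keys.
changes = {}
for i, value in enumerate(volumes):
    remaining = volumes[:i] + volumes[i + 1:]
    changes[value] = statistics.pstdev(remaining) - original_sd

for value, change in sorted(changes.items()):
    print(f"discard {value}: SD changes by {change:+.2f}")

most_increase = max(changes, key=changes.get)
most_decrease = min(changes, key=changes.get)
print(most_increase, most_decrease)  # 38 56
```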
For general understanding:
Imagine a normal bell-shaped curve, or a bell-shaped tent with a long pole at the centre (the mean).
Values close or equal to the mean: When you add more data at or near the mean, the data clusters at that point. This increases the height of the curve at the centre, so a larger share of the data lies close to the mean and the spread decreases.
When you remove values that are equal or close to the mean, imagine removing the supports that hold the bell shape up near the mean. The curve sags and flattens in both directions, so the spread increases.
Values away from the mean: When you add more data away from the mean, you're stretching the curve beyond its actual shape, so the spread increases.
When you remove data away from the mean, you're releasing the tension held by those values, so the curve shrinks back to its original shape and the spread decreases.
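The intuition can be made exact with a standard identity that is not stated in the post above: if a value at distance d from the mean is removed from n values, the sum of squared deviations drops by n/(n-1) × d². The farther the value is from the mean, the more the squared spread shrinks; a value exactly at the mean removes nothing, while the divisor still falls from n to n-1. The sketch below checks the identity on the six measurements; `sum_sq_dev` is a helper name chosen for this illustration.

```python
volumes = [24, 38, 53, 23, 56, 34]


def sum_sq_dev(data):
    """Sum of squared deviations of data from its own mean."""
    m = sum(data) / len(data)
    return sum((x - m) ** 2 for x in data)


n = len(volumes)
mean = sum(volumes) / n
total = sum_sq_dev(volumes)

for i, x in enumerate(volumes):
    remaining = volumes[:i] + volumes[i + 1:]
    # Value predicted by the identity, using only the distance from the old mean.
    predicted = total - n / (n - 1) * (x - mean) ** 2
    # Value computed directly from the remaining five measurements.
    actual = sum_sq_dev(remaining)
    print(x, round(actual, 1), round(predicted, 1))
```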