This note is based on Chapter 6

Author: Richard Lindsey
STT 200 Arnab Bhattacharjee


Acknowledgement: The author is indebted to Dr. Ashoke Sinha, Dr. Jennifer Kaplan and Dr. Parthanil Roy for allowing him to use and edit many of their slides.

Comparison with unit-free measurement

Z-SCORES

How to compare apples with oranges?
• A college admissions committee is looking at the files of two candidates, one with a total SAT score of 1500 and another with an ACT score of 22. Which candidate scored better?
• How do we compare things when they are measured on different scales?
• We need to standardize the values.


How to standardize?
• Subtract the mean from the value, then divide this difference by the standard deviation.
• The standardized value = the z-score:

$$z = \frac{\text{value} - \text{mean}}{\text{standard deviation}}$$

• z-scores are free of units.


z-scores: An Example
Data: 4, 3, 10, 12, 8, 9, 3 (n = 7 in this case)
Mean = (4 + 3 + 10 + 12 + 8 + 9 + 3)/7 = 49/7 = 7.
Standard Deviation = 3.65.

Data    z-score
4       (4 − 7)/3.65 = −0.82
3       (3 − 7)/3.65 = −1.10
10      (10 − 7)/3.65 = 0.82
12      (12 − 7)/3.65 = 1.37
8       (8 − 7)/3.65 = 0.27
9       (9 − 7)/3.65 = 0.55
3       (3 − 7)/3.65 = −1.10
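The table above can be reproduced with a short Python sketch. It assumes the slide's 3.65 is the sample standard deviation (dividing by n − 1), which is what `statistics.stdev` computes; the helper name `z_scores` is introduced here only for illustration.

```python
from statistics import mean, stdev

data = [4, 3, 10, 12, 8, 9, 3]


def z_scores(values):
    """Standardize each value: (value - mean) / standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (divides by n - 1)
    return [(v - m) / s for v in values]


zs = z_scores(data)
for value, z in zip(data, zs):
    print(f"{value:>2}  z = {z:+.2f}")
```

Because the units of the numerator and the denominator cancel, the printed z-scores stay the same if every data value is converted to another unit first.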


Interpretation of z-scores
• The z-scores measure the distance of the data values from the mean in the standard deviation scale.
• A z-score of 1 means that the data value is 1 standard deviation above the mean.
• A z-score of −1.2 means that the data value is 1.2 standard deviations below the mean.
• Regardless of the direction, the further a data value is from the mean, the more unusual it is.
• A z-score of −1.3 is more unusual than a z-score of 1.2.

How to use z-scores?
• A college admissions committee is looking at the files of two candidates, one with a total SAT score of 1500 and another with an ACT score of 22. Which candidate scored better?
• SAT score: mean = 1600, std. dev. = 500.
• ACT score: mean = 23, std. dev. = 6.
• SAT score 1500 has z-score = (1500 − 1600)/500 = −0.2.
• ACT score 22 has z-score = (22 − 23)/6 = −0.17.
• Both scores are below their means, but the ACT score is closer to its mean (−0.17 > −0.2), so ACT score 22 is better than SAT score 1500.
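The same comparison, written as a minimal Python sketch. The function name `z_score` and the variable names are illustrative assumptions; the means and standard deviations are the ones quoted on the slide.

```python
def z_score(value, mean, std_dev):
    """Distance of value from the mean, in standard deviations."""
    return (value - mean) / std_dev


sat_z = z_score(1500, mean=1600, std_dev=500)
act_z = z_score(22, mean=23, std_dev=6)

# The larger z-score is the better result relative to its own test.
better = "ACT" if act_z > sat_z else "SAT"
print(f"SAT z = {sat_z:.2f}, ACT z = {act_z:.2f}, better: {better}")
```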

Which is more unusual?
Heights of adult women have
• mean of 63.6 in.
• std. dev. of 2.5 in.
Heights of adult men have
• mean of 69.0 in.
• std. dev. of 2.8 in.
A. A 58 in tall woman: z-score = (58 − 63.6)/2.5 = −2.24.
B. A 64 in tall man: z-score = (64 − 69)/2.8 = −1.79.
C. They are the same.

Using z-scores to solve problems
An example using height data and the U.S. Army and U.S. Marine Corps height requirements.
Question: Are the height restrictions set by the U.S. Army and the U.S. Marine Corps more restrictive for men, more restrictive for women, or roughly the same?

Data from a National Health Survey
Heights of adult men have
• mean of 69.0 in.
• standard deviation of 2.8 in.
Heights of adult women have
• mean of 63.6 in.
• standard deviation of 2.5 in.

Height Restrictions
                     Men minimum    Women minimum
U.S. Army            60 in          58 in
U.S. Marine Corps    64 in          58 in

z-scores of the minimum heights
                     Men minimum                                  Women minimum
U.S. Army            60 in, z-score = −3.21 (less restrictive)    58 in, z-score = −2.24 (more restrictive)
U.S. Marine Corps    64 in, z-score = −1.79 (more restrictive)    58 in, z-score = −2.24 (less restrictive)
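The z-scores of the minimum heights can be recomputed with a small Python sketch. The dictionary and function names are assumptions made for illustration, and the rule "higher z-score of the minimum means more restrictive" assumes the two height distributions have the same shape.

```python
# Population figures quoted on the slides: (mean, standard deviation) in inches.
HEIGHTS = {"men": (69.0, 2.8), "women": (63.6, 2.5)}

# Minimum heights in inches, from the Height Restrictions table.
MINIMUMS = {
    "U.S. Army": {"men": 60, "women": 58},
    "U.S. Marine Corps": {"men": 64, "women": 58},
}


def minimum_z(branch, group):
    """z-score of a branch's minimum height within the given group."""
    mean, std_dev = HEIGHTS[group]
    return (MINIMUMS[branch][group] - mean) / std_dev


def more_restrictive(branch):
    """Group whose minimum has the higher z-score (excludes more people)."""
    return max(("men", "women"), key=lambda group: minimum_z(branch, group))


for branch in MINIMUMS:
    print(branch, "is more restrictive for", more_restrictive(branch))
```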

2004 Olympics Women’s Heptathlon
Carolina Kluft (Sweden): Shot Put = 14.77 m, Long Jump = 6.78 m.
Austra Skujyte (Lithuania): Shot Put = 16.40 m, Long Jump = 6.30 m.

                          Shot Put    Long Jump
Mean (all contestants)    13.29 m     6.16 m
Std. Dev.                 1.24 m      0.23 m
n                         28          26

Which performance was better?
A. Skujyte’s shot put: z-score = 2.51.
B. Kluft’s long jump: z-score = 2.70.
C. Both were the same.

Based on shot put and long jump, whose performance was better?
A. Skujyte’s. z-scores: shot put = 2.51, long jump = 0.61. Total z-score = (2.51 + 0.61) = 3.12.
B. Kluft’s. z-scores: shot put = 1.19, long jump = 2.70. Total z-score = (1.19 + 2.70) = 3.89.
C. Both were the same.
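Adding z-scores across events can be sketched in Python as follows. The names `EVENT_STATS`, `RESULTS` and `total_z` are hypothetical; the figures are the ones in the heptathlon table. The sketch sums unrounded z-scores, which here gives the same totals to two decimals as summing the rounded ones.

```python
# (mean, standard deviation) over all contestants, in metres.
EVENT_STATS = {"shot_put": (13.29, 1.24), "long_jump": (6.16, 0.23)}

RESULTS = {
    "Kluft": {"shot_put": 14.77, "long_jump": 6.78},
    "Skujyte": {"shot_put": 16.40, "long_jump": 6.30},
}


def total_z(athlete):
    """Sum of the athlete's z-scores over both events."""
    total = 0.0
    for event, result in RESULTS[athlete].items():
        mean, std_dev = EVENT_STATS[event]
        total += (result - mean) / std_dev
    return total


for athlete in RESULTS:
    print(f"{athlete}: total z-score = {total_z(athlete):.2f}")
```

Summing works directly here because a longer throw or jump is better in both events; for an event where a smaller value is better, such as a running time, the sign of that z-score would need to be flipped first.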


Bell shaped symmetric curve

NORMAL DISTRIBUTION

Effect of Standardization
• Standardization into z-scores does not change the shape of the histogram.
• Standardization into z-scores changes the center of the distribution by making the mean 0.
• Standardization into z-scores changes the spread of the distribution by making the standard deviation 1.
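A quick numerical check of the center and spread claims, as a sketch that reuses the seven data values from the earlier z-score example and assumes the sample standard deviation (`statistics.stdev`):

```python
from statistics import mean, stdev

data = [4, 3, 10, 12, 8, 9, 3]

m, s = mean(data), stdev(data)
zs = [(v - m) / s for v in data]

# After standardization the mean is 0 and the standard deviation is 1,
# up to floating-point rounding error.
print(f"mean of z-scores = {mean(zs):.6f}")
print(f"std. dev. of z-scores = {stdev(zs):.6f}")
```

The shape is unchanged because every value is shifted by the same amount and divided by the same positive number, so the ordering and relative spacing of the values are preserved.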


The Normal Distribution

• In many data sets, the histogram is symmetric, unimodal and bell-shaped.
• Such distributions are known as normal distributions, and the data are said to be normally distributed.


The Histogram of z-scores

If data are normally distributed, then
• The histogram of z-scores is also symmetric, unimodal and bell-shaped.
• We can approximate the histogram by a bell-shaped curve called the normal curve.


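68-95-99.7 (Empirical) Rule
When data are bell-shaped, the z-scores of the data values follow the empirical rule: about 68% of the values have z-scores between −1 and 1, about 95% between −2 and 2, and about 99.7% between −3 and 3.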


More on Normal Distribution

The 68-95-99.7 (Empirical) Rule tells us that if data are normally distributed, then almost all the data points (about 99.7%) are within ±3 standard deviations of the mean.
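The three percentages in the rule can be checked against the standard normal curve. This sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `within` is an assumption for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # normal curve for z-scores: mean 0, std. dev. 1


def within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} std. dev.: {within(k):.1%}")
```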


Approximately what percent of U.S. women do you expect to be between 66 in and 67 in tall?
Heights of adult women are normally distributed with
• mean of 63.6 in,
• standard deviation of 2.5 in.
Use a TI-83/84 Plus.
• Press [2nd] and [VARS] (i.e. [DISTR]).
• Select 2: normalcdf.
• Format of command: normalcdf(lower bound, upper bound, mean, std. dev.)
For this problem: normalcdf(66, 67, 63.6, 2.5) = 0.0816, i.e. about 8.2% of adult U.S. women have heights between 66 in and 67 in.
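For readers without the calculator, the same area can be obtained in Python. The function below is a sketch that mimics the argument order of the calculator command; it is not part of any library, and it relies on `statistics.NormalDist` (Python 3.8 or later).

```python
from statistics import NormalDist


def normalcdf(lower, upper, mean, std_dev):
    """Area under the normal curve between lower and upper."""
    dist = NormalDist(mu=mean, sigma=std_dev)
    return dist.cdf(upper) - dist.cdf(lower)


p = normalcdf(66, 67, 63.6, 2.5)
print(f"{p:.4f}")
```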


Approximately what percent of U.S. women do you expect to be less than 64 in tall?
Heights of adult women are normally distributed with
• mean of 63.6 in,
• standard deviation of 2.5 in.
• Note that here the upper bound is 64, but there is no mention of a lower bound.
• So take a very small value for the lower bound, say −1000.
For this problem: normalcdf(−1000, 64, 63.6, 2.5) = 0.5636, i.e. about 56.4% of adult U.S. women have heights less than 64 in.


Approximately what percent of U.S. women do you expect to be more than 58 in tall?
Heights of adult women are normally distributed with
• mean of 63.6 in,
• standard deviation of 2.5 in.
• Note that here the lower bound is 58, but there is no mention of an upper bound.
• So take a very high value for the upper bound, say 1000.
For this problem: normalcdf(58, 1000, 63.6, 2.5) = 0.987, i.e. about 98.7% of adult U.S. women have heights more than 58 in.


What about men’s height?
Heights of adult men are normally distributed with
• mean of 69 in,
• standard deviation of 2.8 in.
• normalcdf(60, 1000, 69, 2.8) = 0.999. Hence about 99.9% of adult men are taller than 60 in.
• normalcdf(64, 1000, 69, 2.8) = 0.963. So about 96.3% of adult men are taller than 64 in.
Recall that about 98.7% of adult women are taller than 58 in. Thus the U.S. Army height restriction is more restrictive for women than for men (98.7% vs. 99.9% qualify), but the U.S. Marine Corps height restriction is more restrictive for men than for women (96.3% vs. 98.7% qualify).
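The one-sided percentages can also be computed without picking an artificial bound such as −1000 or 1000. The sketch below uses `statistics.NormalDist` (Python 3.8 or later): a lower tail is the cumulative probability itself, and an upper tail is one minus it. The variable names are illustrative.

```python
from statistics import NormalDist

women = NormalDist(mu=63.6, sigma=2.5)
men = NormalDist(mu=69.0, sigma=2.8)

# Lower tail: fraction of women shorter than 64 in.
women_below_64 = women.cdf(64)

# Upper tails: fraction of each group that meets a minimum height.
women_above_58 = 1 - women.cdf(58)
men_above_60 = 1 - men.cdf(60)
men_above_64 = 1 - men.cdf(64)

print(f"women below 64 in: {women_below_64:.1%}")
print(f"Army: men {men_above_60:.1%}, women {women_above_58:.1%}")
print(f"Marine Corps: men {men_above_64:.1%}, women {women_above_58:.1%}")
```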


Below what height are 80% of U.S. men?
Heights of adult men are normally distributed with
• mean of 69 in,
• standard deviation of 2.8 in.
The question is to find the height x such that {Percent of men with height < x} = 80% = 0.8.
Use a TI-83/84 Plus.
• Press [2nd] and [VARS] (i.e. [DISTR]).
• Select 3: invNorm.
• Format of command: invNorm(fraction, mean, std. dev.)
For this problem: invNorm(0.8, 69, 2.8) = 71.36, i.e. 80% of U.S. men have heights less than 71.36 in.


Remark: invNorm
• invNorm only takes the percentage or fraction in the lower tail of the normal distribution.
• For example, suppose the question is “Above what height are 10% of U.S. men?”
Here the question is to find the height x such that {Percent of men with height > x} = 10% = 0.1.
This means {Percent of men with height < x} = (100 − 10)% = 90% = 0.9.
For this problem: invNorm(0.9, 69, 2.8) = 72.59, i.e. 90% of U.S. men have heights less than 72.59 in, so 10% of U.S. men have heights more than 72.59 in.
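Both invNorm calculations can be sketched in Python with `statistics.NormalDist.inv_cdf` (Python 3.8 or later), which likewise takes a lower-tail fraction. The wrapper name `inv_norm` is an assumption chosen to mirror the calculator command.

```python
from statistics import NormalDist


def inv_norm(fraction, mean, std_dev):
    """Value below which the given fraction of a normal distribution lies."""
    return NormalDist(mu=mean, sigma=std_dev).inv_cdf(fraction)


# Height below which 80% of men fall.
h_80 = inv_norm(0.8, 69, 2.8)

# Height above which 10% of men fall: convert to the lower tail first.
h_top_10 = inv_norm(1 - 0.1, 69, 2.8)

print(f"{h_80:.2f} in, {h_top_10:.2f} in")
```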
