Chapter 7. Section 7.1

Author: Arnold Cannon
Section 7.1 Check Your Understanding, page 417:

1. The parameter is $\mu = 20$ ounces of iced tea. The statistic is $\bar{x} = 19.6$ ounces of iced tea.

2. The parameter is $p = 0.10$, or 10% of passengers. The statistic is $\hat{p} = 0.08$, or 8% of the sample of passengers.

Check Your Understanding, page 420:

1. The individuals are the M&M’S® Milk Chocolate Candies, the variable is the color of the M&M and the parameter of interest is the proportion of orange M&M’s.

2. For this sample there are 11 orange M&M’s, so $\hat{p} = \frac{11}{50} = 0.22$.

3. The middle graph is the approximate sampling distribution of $\hat{p}$. The statistic measures the proportion of oranges in samples of M&M’s. Assuming that the company is correct, 20% of the M&M’s are orange, so the center of the distribution of $\hat{p}$ should be at approximately 0.20. The first graph shows the distribution of the colors for one sample, rather than the distribution of $\hat{p}$ from many samples, and the third graph is centered at the wrong spot.

Check Your Understanding, page 426:

1. The sample median does not appear to be an unbiased estimator of the population median. The mean of the 500 sample medians is 73.5, whereas the median of the population is 75.

2. With larger samples, the spread of the sampling distribution is smaller, so increasing the sample size from 10 to 20 will decrease the spread of the sampling distribution.


The Practice of Statistics for AP*, 4/e

3. The sampling distribution is skewed to the left. This means that, in general, underestimates of the population median will be greater than overestimates.

Exercises, page 428:

7.1 Population: people who signed a card saying that they intend to quit smoking. Parameter of interest: proportion of the population who signed the card saying they would not smoke who actually quit smoking. Sample: a random sample of 1000 people who signed the cards. Sample statistic: $\hat{p} = 0.21$.

7.2 Population: individuals in the U.S. Parameter of interest: proportion of the U.S. population who were unemployed. Sample: a random sample of individuals from 55,000 U.S. households. Sample statistic: $\hat{p} = 0.100$.

7.3 Population: all the turkey meat. Parameter of interest: minimum temperature. Sample: 4 randomly chosen points in the turkey. Sample statistic: sample minimum = 170°F.

7.4 Population: all gasoline stations in a large city. Parameter of interest: range of gas prices at the gasoline stations in the city. Sample: a random sample of 10 gas stations in the city. Sample statistic: sample range = 25 cents.

7.5 $\mu = 2.5003$ is a parameter (related to the population of all the ball bearings in the container) and $\bar{x} = 2.5009$ is a statistic (related to the sample of 100 ball bearings).

7.6 $p = 0.41$ is a parameter (related to the population of all registered voters) and $\hat{p} = 0.33$ is a statistic (related to the sample of 250 registered voters).

7.7 $\hat{p} = 0.48$ is a statistic (related to the sample of 100 numbers dialed) and $p = 0.52$ is a parameter (related to the population of all residential phone numbers in Los Angeles).

7.8 $\bar{x} = 64.5$ inches is a statistic (related to the sample of female college students) and $\mu = 63$ inches is a parameter (related to the population of all adult American women).

7.9 (a) This is not the exact sampling distribution because that would require a value of $\hat{p}$ for all possible samples of size 100. However, it is an approximation of the sampling distribution that we created through simulation. (b) The distribution is centered at 0.60 and is reasonably symmetric and bell-shaped. Values vary from about 0.47 to 0.74. The values at 0.47, 0.73 and 0.74 are outliers. (c) If we found that only 45 students said that they did all their homework last week, we would be skeptical of the newspaper’s claim that 60% of students did their homework last week. None of the simulated samples had a proportion this low.

7.10 (a) This is not the exact sampling distribution because that would require a value of $\bar{x}$ for all possible samples of size 20. However, it is an approximation of the sampling distribution that we created through simulation. (b) The distribution is centered at 64 and is reasonably symmetric and bell-shaped. Values vary from about 62.4 to 65.7. There do not seem to be any outliers. (c) If we found that the sample mean was 64.7 inches, we would likely conclude that the population mean height for females at this school could be 64. In our simulation we found values of 64.7 or larger in about 10% of the samples.
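The simulation idea behind 7.9 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population is treated as effectively infinite with true proportion 0.60, and the number of repetitions (2000), the seed, and the function name `simulate_phat` are hypothetical choices, not details taken from the exercise.

```python
import random
import statistics

def simulate_phat(p, n, reps, seed=1):
    """Return `reps` simulated sample proportions, each from a sample of size n.

    Assumes independent draws from a population with true proportion p.
    """
    rng = random.Random(seed)
    phats = []
    for _ in range(reps):
        successes = sum(1 for _ in range(n) if rng.random() < p)
        phats.append(successes / n)
    return phats

# Approximate sampling distribution of p-hat for n = 100 and p = 0.60
phats = simulate_phat(p=0.60, n=100, reps=2000)
center = statistics.mean(phats)
spread = statistics.stdev(phats)

# Share of simulated samples with a proportion of 0.45 or lower
tail = sum(1 for v in phats if v <= 0.45) / len(phats)
print(f"center ~ {center:.3f}, spread ~ {spread:.3f}, share <= 0.45: {tail:.4f}")
```

A very small tail share is what makes an observed proportion of 0.45 hard to reconcile with a claimed 60%.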

Chapter 7: Sampling Distributions


7.11 (a)

(b) Answers will vary. An example bar graph is given.

7.12 (a)


(b) Answers will vary. An example histogram is given below.

7.13 (a) The approximate sampling distribution is skewed to the right with a center at 9 $(^{\circ}\mathrm{F})^2$. The values vary from about 2 to 27.5 $(^{\circ}\mathrm{F})^2$. (b) A sample variance of 25 is quite large compared with what we would expect, since only one out of 500 SRSs had a variance that high. It suggests that the manufacturer’s claim is false and that the thermostat actually has more variability than claimed.

7.14 (a) The approximate sampling distribution is reasonably symmetric and centered at 45.5°F. The values vary from about 39 to 50°F. (b) A sample minimum of 40°F is quite low compared with what we would expect. This suggests that the manufacturer’s claim is false.

7.15 (a) The population is the 12,000 students; the population distribution (Normal with mean 7.11 minutes and standard deviation 0.74 minutes) describes the time it takes randomly selected individuals to run a mile. (b) The sampling distribution (Normal with mean of 7.11 minutes and standard deviation of 0.074 minutes) describes the average mile-time for 100 randomly selected students. This is different from the population distribution in that it has a smaller standard deviation and it describes the mean of 100 mile times rather than individual mile times.

7.16 (a) The population is the 4000 beads in the container. 1000 of the beads are white and 3000 are red. (b) The distribution of the sample proportion is approximately Normal with mean 0.75 and standard deviation 0.06. The sample proportion is a numerical variable and so its distribution could be shown using a histogram or dotplot. The color of the individual beads is a categorical variable and so its distribution would be best shown with a bar graph.

7.17 (a) Since the smallest number of total tax returns (i.e., the smallest population) is still more than 10 times the sample size, the variability of the sample proportion will be (approximately) the same for all states. (b) Yes, it will change. The sample taken from Wyoming will be about the same size, but the sample in, for example, California will be considerably larger, and therefore the variability of the sample proportion will be smaller.

7.18 (a) A larger sample does not reduce the bias of a poll result. If the sampling technique results in bias, simply increasing the sample size will not reduce the bias. (b) A larger sample will reduce the variability of the result. More people means more information, which means less variability.

7.19 (a) Graph (c) shows an unbiased estimator because the mean of the distribution is very close to the population parameter. (b) The graph in (b) shows the statistic that does the best job at estimating the parameter. Although it is biased, the bias is small and the statistic has very little variability.


7.20 (a) If we choose many samples, the average of the $\bar{x}$-values from these samples will be close to $\mu$. In other words, the sampling distribution of $\bar{x}$ is centered at the population mean $\mu$ we are trying to estimate. (b) A larger sample will give more information and, therefore, more precise results. The variability in the distribution of the sample average decreases as the sample size increases.

7.21 d.

7.22 e.

7.23 c.

7.24 b.

7.25 (a) This is the same thing as asking what percent of Normal scores are more than 2.5 standard deviations below the mean. In other words, what is $P(z < -2.5)$? Using Table A, this value is 0.0062. (b) The distribution for the older women, based on the standard scale for younger women, is Normal with mean $-2$ and standard deviation 1. So the question is asking for the probability of getting a standard score of less than $-2.5$. This is $P(X < -2.5) = P\left(z < \frac{-2.5 - (-2)}{1}\right) = P(z < -0.5) = 0.3085$. So, based on this criterion, about 31% of women aged 70-79 have osteoporosis.

7.26 (a) The equation for the least squares regression line is $\hat{y} = 1.4146 + 0.4399x$, where $\hat{y}$ is the predicted average number of offspring per female and $x$ is the index of the abundance of pine cones. (b) The points in the residual plot are well scattered, so this tells us that the linear model is appropriate for the data. (c) 57.2% of the variation in the average number of offspring can be accounted for by the linear model relating average number of offspring to abundance of pine cones, and we expect individual predictions to be off by an average of about 0.6.

Section 7.2 Check Your Understanding, page 437:

1. The mean of the sampling distribution is the same as the population proportion. In this case $\mu_{\hat{p}} = 0.75$.

2. The standard deviation of the sampling distribution is $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.75(0.25)}{1000}} = 0.0137$. There are more than 10,000 young adult internet users, so the 10% condition has been met.

3. The sampling distribution of $\hat{p}$ is approximately Normal. Both $np = 1000(0.75) = 750$ and $n(1-p) = 1000(0.25) = 250$ are greater than 10.

4. If the sample size were 9000 instead of 1000, the sampling distribution would still be approximately Normal with mean 0.75. But the standard deviation of the sampling distribution would be smaller by a factor of 3. In this case $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.75(0.25)}{9000}} = 0.0046$.
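The checks in this Check Your Understanding can be reproduced with a short Python sketch. The helper names (`sd_phat`, `normal_condition`, `ten_percent_condition`) are hypothetical, and the population size of 10,000 is only the lower bound quoted in the answer, used here as an assumption.

```python
import math

def sd_phat(p, n):
    """Standard deviation of the sample proportion: sqrt(p(1 - p)/n)."""
    return math.sqrt(p * (1 - p) / n)

def ten_percent_condition(n, population_size):
    """True when the sample is at most 10% of the population."""
    return n <= 0.10 * population_size

def normal_condition(p, n):
    """True when np and n(1 - p) are both at least 10.

    A tiny tolerance guards against floating-point round-off
    when a product equals 10 exactly.
    """
    tol = 1e-9
    return n * p >= 10 - tol and n * (1 - p) >= 10 - tol

p = 0.75
sd_1000 = sd_phat(p, 1000)
sd_9000 = sd_phat(p, 9000)
print(round(sd_1000, 4), round(sd_9000, 4))
print(ten_percent_condition(1000, 10_000), normal_condition(p, 1000))
```

Because the standard deviation depends on $\sqrt{n}$, multiplying the sample size by 9 divides the standard deviation by 3, which the ratio `sd_1000 / sd_9000` confirms.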


Exercises, page 439:

7.27 (a) We would not be surprised to find 8 (32%) orange candies. From the graph in figure 7.11 there were a fair number of simulations in which there were 8 or fewer orange candies. On the other hand, there were only a couple of simulations where there were 5 (20%) or fewer, so if this occurred, we should be surprised. (b) It is more surprising to get 32% orange candies in a sample of 50 than it is in a sample of 25. Comparing the graphs in figures 7.11 and 7.12, there were a fair number of simulations in 7.11 (sample size 25) with 32% or less, but very few in 7.12 (sample size 50) with 32% or less.

7.28 (a) We would be surprised to find 32% orange candies in this case. Very few of the simulations with sample size 25 had 32% or more orange candies. However, we would not be surprised to find 20% orange candies. This is very near the center of the distribution. (b) We would be surprised to find 32% orange candies in either case since neither simulation had many samples with 32% or more orange candies. However, it is even rarer when the sample size is 50.

7.29 (a) The mean of the sampling distribution is the same as the population proportion, so $\mu_{\hat{p}} = p = 0.45$. (b) The standard deviation of the sampling distribution is $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.45(0.55)}{25}} = 0.0995$. In this case the 10% condition is met because it is very likely true that there are more than 250 candies. (c) The sampling distribution is approximately Normal because $np = 25(0.45) = 11.25$ and $n(1-p) = 25(0.55) = 13.75$ are both at least 10. (d) If the sample size were 50 rather than 25, the sampling distribution would still be approximately Normal with mean 0.45, but the standard deviation would be $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.45(0.55)}{50}} = 0.0704$.

7.30 (a) The mean of the sampling distribution is the same as the population proportion, so $\mu_{\hat{p}} = p = 0.15$. (b) The standard deviation of the sampling distribution is $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.15(0.85)}{25}} = 0.0714$. In this case the 10% condition is met because it is very likely true that there are more than 250 candies. (c) The sampling distribution is not approximately Normal because $np = 25(0.15) = 3.75$ is less than 10. Note that $n(1-p) = 25(0.85) = 21.25$ is at least 10, but for the Normal approximation to be appropriate, both numbers must be at least 10. (d) If the sample size were 75 rather than 25, the sampling distribution would now be approximately Normal with mean 0.15 and standard deviation $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.15(0.85)}{75}} = 0.0412$, since $np = 75(0.15) = 11.25$ and $n(1-p) = 75(0.85) = 63.75$ are both at least 10.

7.31 (a) The 10% condition is not met here. Out of the population of 76 passengers, 10 people were screened (13%). This means that they sampled more than 10% of the population. (b) No. The Normal condition is also not met since the total sample size was 10. Necessarily, both np and n (1 − p ) will be less than 10, violating the condition for Normality. 7.32 (a) The 10% condition is met here. We are drawing a sample of 7 out of 100 tiles. This is less than 10% of the population. (b) The Normal condition is not met here since the total sample size was 7. Necessarily, both np and n (1 − p ) will be less than 10, violating the condition for Normality.


7.33 The Normal condition is not met here: $np = 15(0.3) = 4.5 < 10$. Challenge: Let $X$ be the number of Hispanic workers in the sample. $X$ has an approximate binomial distribution with $n = 15$ and $p = 0.3$.

$P(X \le 3) = \binom{15}{0}(0.3)^0(1-0.3)^{15} + \cdots + \binom{15}{3}(0.3)^3(1-0.3)^{12} = 0.2969$.
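The binomial sum in the Challenge part of 7.33 can be checked directly. The sketch below uses only the Python standard library; `binomial_cdf` is a hypothetical helper name introduced for this illustration.

```python
from math import comb

def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

# n = 15 workers sampled, p = 0.3, at most 3 successes
prob = binomial_cdf(3, 15, 0.3)
print(round(prob, 4))
```

Summing the four terms for 0, 1, 2, and 3 successes avoids the Normal approximation, which is not appropriate here because $np < 10$.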

7.34 The 10% condition is not met here. The sample of 50 is more than 10% of the population (which is of size 316).

7.35 (a) The mean of the sampling distribution is the same as the population proportion, so it is $\mu_{\hat{p}} = p = 0.70$. (b) The standard deviation of the sampling distribution is $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.7(0.3)}{1012}} = 0.0144$. The population (all U.S. adults) is clearly at least 10 times as large as the sample (the 1012 surveyed adults), so the 10% condition is met. (c) The sampling distribution is approximately Normal since $np = 1012(0.70) = 708.4$ and $n(1-p) = 1012(0.30) = 303.6$ are both at least 10. (d) $P(\hat{p} \le 0.67) = P(z \le -2.08) = 0.0188$. This is a fairly unusual result if 70% of the population actually drink the cereal milk.
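The probability in 7.35 (d) can be reproduced with `statistics.NormalDist` from the Python standard library. This is a sketch of the Normal approximation only; because the software does not round $z$ to two decimal places, the result differs slightly from the Table A value.

```python
import math
from statistics import NormalDist

p, n = 0.70, 1012

# Standard deviation of the sampling distribution of p-hat
sd = math.sqrt(p * (1 - p) / n)

# Standardize the observed proportion and find the lower-tail area
z = (0.67 - p) / sd
prob = NormalDist().cdf(z)
print(f"sd = {sd:.4f}, z = {z:.2f}, P(p-hat <= 0.67) ~ {prob:.4f}")
```

The same pattern works for upper-tail questions by using `1 - NormalDist().cdf(z)`.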

7.36 (a) The mean is the same as the population proportion, so it is $\mu_{\hat{p}} = p = 0.4$. (b) The standard deviation is $\sigma_{\hat{p}} = \sqrt{\frac{0.4(0.6)}{1785}} = 0.0116$. Since the population is clearly at least 10 times bigger than the sample, the 10% condition is met. (c) The sampling distribution is approximately Normal since $np = 1785(0.4) = 714$ and $n(1-p) = 1785(0.6) = 1071$ are both at least 10. (d) $P(\hat{p} \ge 0.44) = P\left(z \ge \frac{0.44 - 0.4}{0.0116}\right) = P(z \ge 3.45) = 0.0003$. It is very unlikely, if the true proportion of people who attend church or synagogue is 0.40, that 44% or more will answer the poll saying that they attend church or synagogue.

7.37 Since the standard deviation is found by dividing by $\sqrt{n}$, using $4n$ for the sample size halves the standard deviation ($\sqrt{4n} = 2\sqrt{n}$); we would need to sample $1012(4) = 4048$ adults.

7.38 Since the standard deviation is found by dividing by $\sqrt{n}$, using $9n$ for the sample size cuts the standard deviation to one-third of its original value ($\sqrt{9n} = 3\sqrt{n}$); we would need to sample $9(1785) = 16{,}065$ adults.
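The square-root relationship used in 7.37 and 7.38 can be expressed as a small calculation: to divide the standard deviation by a factor $k$, multiply the sample size by $k^2$. The function names below are hypothetical, chosen only for this sketch.

```python
import math

def sd_phat(p, n):
    """Standard deviation of the sample proportion: sqrt(p(1 - p)/n)."""
    return math.sqrt(p * (1 - p) / n)

def sample_size_for_sd_reduction(n, factor):
    """Sample size needed to divide the standard deviation by `factor`."""
    return n * factor**2

n_half = sample_size_for_sd_reduction(1012, 2)   # halve the SD (7.37)
n_third = sample_size_for_sd_reduction(1785, 3)  # cut the SD to one-third (7.38)
print(n_half, n_third)

# Check the claim numerically for 7.37 (p = 0.70)
ratio = sd_phat(0.70, 1012) / sd_phat(0.70, n_half)
print(round(ratio, 6))
```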

7.39 State: We want to find the probability that $\hat{p}$ is at least 0.75. In symbols, that’s $P(\hat{p} \ge 0.75)$. Plan: We have an SRS of size 267 drawn from a population in which the proportion $p = 0.70$ of college women have been on a diet within the past 12 months. This means that $\mu_{\hat{p}} = 0.70$ and, since the population clearly contains more than $267(10) = 2670$ college women, $\sigma_{\hat{p}} = \sqrt{\frac{0.7(0.3)}{267}} = 0.0280$. We also check the Normal condition: $np = 267(0.7) = 186.9$ and $n(1-p) = 267(0.3) = 80.1$ are both at least 10, so the distribution of $\hat{p}$ can be approximated by a Normal distribution. Do: $P(\hat{p} \ge 0.75) = P\left(z \ge \frac{0.75 - 0.7}{0.0280}\right) = P(z \ge 1.79) = 0.0367$. Conclude: About 3.67% of all SRSs of size 267 will give a sample proportion that is 0.75 or greater.

7.40 State: We want to find the probability that $\hat{p}$ is at least 0.20. In symbols, that’s $P(\hat{p} \ge 0.20)$. Plan: We will have an SRS of size 500 drawn from a population in which the proportion $p = 0.14$ of motorcycle owners own Harleys. This means that $\mu_{\hat{p}} = 0.14$ and, since the population clearly contains more than $500(10) = 5000$ motorcycle owners, $\sigma_{\hat{p}} = \sqrt{\frac{0.14(0.86)}{500}} = 0.0155$. We also check the Normal condition: $np = 500(0.14) = 70$ and $n(1-p) = 500(0.86) = 430$ are both at least 10, so the distribution of $\hat{p}$ can be approximated by a Normal distribution. Do: $P(\hat{p} \ge 0.20) = P\left(z \ge \frac{0.20 - 0.14}{0.0155}\right) = P(z \ge 3.87) \approx 0$. Conclude: While it is possible, it is extremely unlikely that we would get a sample of 500 motorcycle owners in which at least 20% own Harleys.

7.41 (a) State: We want to find the probability that $\hat{p}$ is at most 0.86. In symbols, that’s $P(\hat{p} \le 0.86)$. Plan: We have an SRS of size 100 drawn from a population in which $p = 0.90$ of orders are shipped within three working days. This means that $\mu_{\hat{p}} = 0.90$ and, since the population contains more than $100(10) = 1000$ orders in the last week (we are told there were 5000), $\sigma_{\hat{p}} = \sqrt{\frac{0.9(0.1)}{100}} = 0.03$. We also check the Normal condition: $np = 100(0.9) = 90$ and $n(1-p) = 100(0.1) = 10$ are both at least 10, so the distribution of $\hat{p}$ can be approximated by a Normal distribution. Do: $P(\hat{p} \le 0.86) = P\left(z \le \frac{0.86 - 0.9}{0.03}\right) = P(z \le -1.33) = 0.0918$. Conclude: There is a 9.18% chance that we would get a sample in which 86% or fewer of the orders were shipped within three working days. (b) If the claim is correct, then we can expect to observe 86% or fewer orders shipped on time in about 9.18% of the samples of this size. Getting a sample proportion at or below 0.86 is not an unlikely event.

7.42 (a) State: We want to find the probability that $\hat{p}$ is at most 0.62. In symbols, that’s $P(\hat{p} \le 0.62)$. Plan: We have an SRS of size 100 drawn from a population in which $p = 0.67$ of students support efforts to crack down on underage drinking. This means that $\mu_{\hat{p}} = 0.67$ and, since the population contains more than $100(10) = 1000$ students, $\sigma_{\hat{p}} = \sqrt{\frac{0.67(0.33)}{100}} = 0.0470$. We also check the Normal condition: $np = 100(0.67) = 67$ and $n(1-p) = 100(0.33) = 33$ are both at least 10, so the distribution of $\hat{p}$ can be approximated by a Normal distribution. Do: $P(\hat{p} \le 0.62) = P\left(z \le \frac{0.62 - 0.67}{0.0470}\right) = P(z \le -1.06) = 0.1446$. Conclude: There is a 14.46% chance that we would get a sample in which 62% or fewer of the students supported efforts to crack down on underage drinking. (b) Getting a sample proportion at or below 0.62 is not an unlikely event. The sample results are lower than the national percentage, but the sample was so small that such a difference could arise by chance even if the true campus proportion is the same.

7.43 b.

7.44 c.

7.45 b.

7.46 b.

7.47 62% neither download nor share music files.

7.48 (a) Assign numbers 01-14 to the animals (01 to the desert tortoise, 02 to the Olive Ridley sea turtle, …, 14 to the San Francisco garter snake). Starting at line 111 in Table D, read pairs of numbers until you get three different numbers between 01 and 14. These numbers represent the animals chosen. (b) Using Table D, the animals chosen are 12, 04, and 11, which represent the blunt-nosed leopard lizard, the flat-tailed horned lizard and the Coachella Valley fringe-toed lizard.

Section 7.3 Check Your Understanding, page 448:

1. $P(X > 270) = P\left(z > \frac{270 - 266}{16}\right) = P(z > 0.25) = 0.4013$

2. The mean of the sampling distribution of $\bar{x}$ is the same as the mean of the distribution of $X$, so $\mu_{\bar{x}} = \mu_X = 266$ days.

3. First, we check the 10% condition. We are taking a sample of 6 pregnant women. There are clearly more than $10(6) = 60$ pregnant women, so this condition is met. Therefore, the standard deviation of the sampling distribution is $\sigma_{\bar{x}} = \frac{\sigma_X}{\sqrt{n}} = \frac{16}{\sqrt{6}} = 6.532$ days.

4. $P(\bar{x} > 270) = P\left(z > \frac{270 - 266}{6.532}\right) = P(z > 0.61) = 0.2709$
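The contrast between one pregnancy and the mean of six in this Check Your Understanding can be sketched with `statistics.NormalDist`. The code assumes, as the answers do, that pregnancy length is Normal with mean 266 days and standard deviation 16 days; results differ from Table A in the third or fourth decimal place because $z$ is not rounded.

```python
import math
from statistics import NormalDist

mu, sigma, n = 266, 16, 6

# Standard deviation of the sample mean: sigma / sqrt(n)
sd_xbar = sigma / math.sqrt(n)

# Probability that one pregnancy lasts more than 270 days
p_single = 1 - NormalDist(mu, sigma).cdf(270)

# Probability that the mean of 6 pregnancies exceeds 270 days
p_mean = 1 - NormalDist(mu, sd_xbar).cdf(270)

print(f"sd of x-bar = {sd_xbar:.3f}")
print(f"P(X > 270) ~ {p_single:.4f}, P(x-bar > 270) ~ {p_mean:.4f}")
```

Averaging shrinks the spread, so a mean above 270 days is less likely than a single value above 270 days.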

Exercises, page 454:

7.49 The mean is $\mu_{\bar{x}} = \mu_X = 225$ seconds, and the standard deviation is $\sigma_{\bar{x}} = \frac{\sigma_X}{\sqrt{n}} = \frac{60}{\sqrt{10}} = 18.974$ seconds. These results do not depend on the shape of the distribution of the individual play times.

7.50 The mean is $\mu_{\bar{x}} = \mu_X = 40.125$ mm, and the standard deviation is $\sigma_{\bar{x}} = \frac{\sigma_X}{\sqrt{n}} = \frac{0.002}{\sqrt{4}} = 0.001$ mm. These results do not depend on the shape of the distribution of the individual axle diameters.

7.51 If we want $\sigma_{\bar{x}} = 30$, then we need to solve the following equation for $n$: $30 = \frac{60}{\sqrt{n}} \rightarrow \sqrt{n} = \frac{60}{30} = 2 \rightarrow n = 4$. So we need a sample of size 4.

7.52 If we want $\sigma_{\bar{x}} = 0.0005$, then we need to solve the following equation for $n$: $0.0005 = \frac{0.002}{\sqrt{n}} \rightarrow \sqrt{n} = \frac{0.002}{0.0005} = 4 \rightarrow n = 16$. So we need a sample of size 16.

7.53 (a) The sampling distribution of $\bar{x}$ is Normal with $\mu_{\bar{x}} = \mu_X = 188$ mg/dl and $\sigma_{\bar{x}} = \frac{\sigma_X}{\sqrt{n}} = \frac{41}{\sqrt{100}} = 4.1$ mg/dl. (b) $P(185 \le \bar{x} \le 191) = P\left(\frac{185 - 188}{4.1} \le z \le \frac{191 - 188}{4.1}\right) = P(-0.73 \le z \le 0.73) = 0.7673 - 0.2327 = 0.5346$. (c) In this case $\sigma_{\bar{x}} = \frac{\sigma_X}{\sqrt{n}} = \frac{41}{\sqrt{1000}} = 1.30$ mg/dl. So now the probability becomes $P(185 \le \bar{x} \le 191) = P\left(\frac{185 - 188}{1.30} \le z \le \frac{191 - 188}{1.30}\right) = P(-2.31 \le z \le 2.31) = 0.9896 - 0.0104 = 0.9792$. The larger sample is better since it is more likely to produce a sample mean within 3 mg/dl of the population mean.

7.54 (a) The sampling distribution of $\bar{x}$ is Normal with $\mu_{\bar{x}} = \mu_X = 55{,}000$ miles and $\sigma_{\bar{x}} = \frac{\sigma_X}{\sqrt{n}} = \frac{4500}{\sqrt{8}} = 1591$ miles. (b) $P(\bar{x} < 51{,}800) = P\left(z < \frac{51{,}800 - 55{,}000}{1591}\right) = P(z < -2.01) = 0.0222$. Getting a sample mean this low would be a surprising result if the company’s claim was true. Thus, I would doubt the company’s claim.

7.55 (a) Let $X$ denote the amount of cola in a bottle. $P(X < 295) = P\left(z < \frac{295 - 298}{3}\right) = P(z < -1) = 0.1587$. (b) If $\bar{x}$ is the mean contents of six bottles (assumed to be independent), then $\bar{x}$ has a Normal distribution with $\mu_{\bar{x}} = \mu_X = 298$ ml and $\sigma_{\bar{x}} = \frac{\sigma_X}{\sqrt{n}} = \frac{3}{\sqrt{6}} = 1.2247$ ml (10% condition OK since there are more than 60 bottles in the population). $P(\bar{x} < 295) = P\left(z < \frac{295 - 298}{1.2247}\right) = P(z < -2.45) = 0.0071$.


7.56 (a) Let $X$ denote the ACT score of a randomly selected test taker. $P(X > 23) = P\left(z > \frac{23 - 21.1}{5.1}\right) = P(z > 0.37) = 0.3557$. (b) For a sample of size 50, the sampling distribution is Normal with mean $\mu_{\bar{x}} = 21.1$ and standard deviation $\sigma_{\bar{x}} = \frac{5.1}{\sqrt{50}} = 0.7212$ (10% condition OK since there are more than 500 ACT test takers). $P(\bar{x} > 23) = P\left(z > \frac{23 - 21.1}{0.7212}\right) = P(z > 2.63) = 0.0043$.

7.57 No. The histogram of the sample values will look like the population distribution, whatever it might happen to be. The central limit theorem says that the histogram of the distribution of sample means (from many large samples) will look more and more Normal.

7.58 When the sample size is small, the sampling distribution is still skewed to the right, but less so than the population. As the sample size $n$ gets larger, the sampling distribution of the sample mean will more closely follow a Normal distribution.

7.59 (a) Since the distribution of the play times of the population of songs is heavily skewed to the right, a sample size of 10 will not be enough for the Normal approximation to be appropriate. (b) With a sample size of 36, we now have enough observations in our sample for the Central Limit Theorem to apply. $P(\bar{x} > 240) = P\left(z > \frac{240 - 225}{60/\sqrt{36}}\right) = P(z > 1.5) = 0.0668$.

7.60 (a) If $\bar{x}$ is the mean number of strikes per square kilometer, then $\mu_{\bar{x}} = 6$ strikes/km² and $\sigma_{\bar{x}} = \frac{2.4}{\sqrt{10}} = 0.7589$ strikes/km². (b) We cannot calculate the probability, because we do not know the shape of the distribution of the number of lightning strikes. If we were told that the population is Normal, then we would be able to compute the probability. (c) With a sample size of 50, the Central Limit Theorem assures us that the Normal approximation is valid for the sampling distribution of $\bar{x}$. $P(\bar{x}$ […] $5 - 6$ […]

[…] $6000) = P(\bar{x} > 200) = P\left(z > \frac{[\ldots]}{6.3901}\right) = P(z > 1.56) = 0.0594$. There is about a 6% chance that the total weight exceeds the limit of 6000 lb.
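The central limit theorem behavior described in 7.57 and 7.58 can be illustrated with a small simulation. Everything specific here is an assumption made for the sketch: the right-skewed population is taken to be exponential with mean 1, the sample sizes are 2 and 40, and the helper names `skewness` and `sample_means` are hypothetical.

```python
import random
import statistics

def skewness(values):
    """Sample skewness: average cubed z-score (near 0 for symmetric data)."""
    m = statistics.mean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

def sample_means(n, reps, rng):
    """Means of `reps` samples of size n from an exponential population (mean 1)."""
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)]

rng = random.Random(7)
means_small = sample_means(n=2, reps=4000, rng=rng)
means_large = sample_means(n=40, reps=4000, rng=rng)

skew_small = skewness(means_small)
skew_large = skewness(means_large)
print(f"skewness with n = 2: {skew_small:.2f}; with n = 40: {skew_large:.2f}")
```

With the small sample size the distribution of sample means is still clearly right-skewed; with the larger one it is much closer to symmetric, while its center stays at the population mean.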
7.62 (a) No. A count only takes on whole-number values, so it cannot be Normally distributed. (b) The approximate distribution of $\bar{x}$ is Normal with mean $\mu_{\bar{x}} = 1.5$ people and standard deviation $\sigma_{\bar{x}} = \frac{0.75}{\sqrt{700}} = 0.0283$. Thus, $P(X > 1075) = P\left(\bar{x} > \frac{1075}{700}\right) = P(\bar{x} > 1.5357) = P\left(z > \frac{1.5357 - 1.5}{0.0283}\right) = P(z > 1.26) = 0.1038$.

7.63 State: What is the probability that the average loss will be greater than $275? Plan: The sampling distribution of the sample mean loss $\bar{x}$ has mean $\mu_{\bar{x}} = \mu_X = \$250$ and standard deviation $\sigma_{\bar{x}} = \frac{300}{\sqrt{10{,}000}} = \$3$ (10% condition is met assuming at least 100,000 policies). Since the sample size is so large (10,000 > 30), we can safely use the Normal distribution as an approximation for the sampling distribution of $\bar{x}$. Do: $P(\bar{x} > 275) = P\left(z > \frac{275 - 250}{3}\right) = P(z > 8.33) \approx 0$. Conclude: It is very, very unlikely that the company would have an average loss of more than $275.

7.64 State: What is the probability that the mean number of flaws per square yard of carpet is more than 2? Plan: The sampling distribution of the sample mean number of flaws per square yard has mean $\mu_{\bar{x}} = 1.6$ and standard deviation $\sigma_{\bar{x}} = \frac{1.2}{\sqrt{200}} = 0.085$ (10% condition OK since there are more than 2000 square yards of material). Since the sample size is so large (200 > 30), we can safely use the Normal distribution as an approximation for the sampling distribution of $\bar{x}$. Do: $P(\bar{x} > 2) = P\left(z > \frac{2 - 1.6}{0.085}\right) = P(z > 4.71) \approx 0$. Conclude: There is virtually no chance that the average number of flaws per yard in the sample will be found to be greater than 2.

7.65 a.

7.66 c.

7.67 b.

7.68 d.


7.69 The unemployment rates for each level of education are:

$P(\text{unemployed} \mid \text{didn't finish HS}) = \frac{12{,}470 - 11{,}408}{12{,}470} = \frac{1062}{12{,}470} = 0.0852$

$P(\text{unemployed} \mid \text{HS but no college}) = \frac{1977}{37{,}834} = 0.0523$

$P(\text{unemployed} \mid \text{less than bachelor's degree}) = \frac{1462}{34{,}439} = 0.0425$

$P(\text{unemployed} \mid \text{college graduate}) = \frac{1097}{40{,}390} = 0.0272$

The unemployment rate decreases with additional education.

7.70 $P(\text{in labor force}) = \frac{12{,}470 + 37{,}834 + 34{,}439 + 40{,}390}{27{,}669 + 59{,}860 + 47{,}556 + 51{,}582} = \frac{125{,}133}{186{,}667} = 0.6704$

7.71 $P(\text{in labor force} \mid \text{college graduate}) = \frac{40{,}390}{51{,}582} = 0.7830$

7.72 The events “in the labor force” and “college graduate” are not independent, since the probability of being in the labor force (0.6704) does not equal the probability of being in the labor force given that the person is a college graduate (0.7830).
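The independence check in 7.70 through 7.72 amounts to comparing a marginal probability with a conditional one. The sketch below uses the counts quoted in those answers; the variable names and the comparison tolerance are hypothetical choices for illustration.

```python
import math

# Labor-force counts and population counts for the four education groups,
# in the order the sums appear in 7.70 (college graduates listed last)
labor_force_counts = [12_470, 37_834, 34_439, 40_390]
population_counts = [27_669, 59_860, 47_556, 51_582]

# Marginal probability of being in the labor force
p_labor = sum(labor_force_counts) / sum(population_counts)

# Conditional probability given college graduate
p_labor_given_grad = labor_force_counts[-1] / population_counts[-1]

# Independence would require the two probabilities to be equal
# (the 0.005 tolerance is an arbitrary assumption for this sketch)
independent = math.isclose(p_labor, p_labor_given_grad, abs_tol=0.005)

print(round(p_labor, 4), round(p_labor_given_grad, 4), independent)
```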

Chapter Review Exercises (page 458)

R7.1 The population is the set of all eggs shipped on the day in question. The sample consists of the 200 eggs examined. The parameter is the proportion $p = 0.001$ of eggs shipped that day that had salmonella. The statistic is the sample proportion $\hat{p} = \frac{9}{200} = 0.045$ of eggs in the sample that had salmonella.

R7.2 (a) Answers will vary. An example dotplot is given.

(b) The dotplot does not show the range of every possible sample of size 5 from the population. Instead it shows the ranges from 500 SRSs from the population. This is a very small subset of the values that make up the sampling distribution.

R7.3 (a) The sample range is not an unbiased estimator of the population range. If it were unbiased, then the sampling distribution would have 3417 (the actual range) as its mean. But, according to the graph, none of the 500 observations were greater than 3000 (and certainly none were greater than 3417). The mean in the graph is closer to 1200, which means that the sample range underestimates the value of the population range. (b) If we want to reduce the variability of the sampling distribution of the sample range, we should take larger samples.

R7.4 (a) The mean is $\mu_{\hat{p}} = p = 0.15$. (b) Since the population (all adults) is considerably larger than 10 times the sample size ($n = 1540$), the standard deviation is $\sigma_{\hat{p}} = \sqrt{\frac{0.15(0.85)}{1540}} = 0.0091$. (c) Since $np = 1540(0.15) = 231$ and $n(1-p) = 1540(0.85) = 1309$ are both at least 10, the sampling distribution is approximately Normal. (d) $P(0.13 \le \hat{p} \le 0.17) = P\left(\frac{0.13 - 0.15}{0.0091} \le z \le \frac{0.17 - 0.15}{0.0091}\right) = P(-2.20 \le z \le 2.20) = 0.9722$.

R7.5 (a) We are looking for the probability that, in a random sample of 100 travelers, 20 or fewer get a red light. This is equivalent to finding $P(\hat{p} \le 0.20)$. First we need the sampling distribution of $\hat{p}$. The mean is $\mu_{\hat{p}} = p = 0.3$. Since the population (all travelers passing through Guadalajara, Mexico) is considerably larger than 10 times the sample size ($n = 100$), the standard deviation is $\sigma_{\hat{p}} = \sqrt{\frac{0.3(0.7)}{100}} = 0.0458$. Since $np = 100(0.3) = 30$ and $n(1-p) = 100(0.7) = 70$ are both at least 10, the sampling distribution is approximately Normal. This means that the probability can be computed as $P(\hat{p} \le 0.20) = P\left(z \le \frac{0.20 - 0.30}{0.0458}\right) = P(z \le -2.18) = 0.0146$. (b) The claim is unlikely to be true. There is only a 1.5% chance that we would find a sample with as few red lights as we saw in our sample if their claim was true.

R7.6 (a) Let $X$ denote the WAIS score for a randomly selected individual. $P(X \ge 105) = P\left(z \ge \frac{105 - 100}{15}\right) = P(z \ge 0.33) = 0.3707$. (b) The mean is $\mu_{\bar{x}} = 100$ and the standard deviation is $\sigma_{\bar{x}} = \frac{15}{\sqrt{60}} = 1.9365$. (c) $P(\bar{x} \ge 105) = P\left(z \ge \frac{105 - 100}{1.9365}\right) = P(z \ge 2.582) = 0.0049$. (d) The answer to (a) could be quite different. The answer to (b) would be the same because the mean and standard deviation do not depend on the shape of the population. Because of the large sample size, the answer we gave for (c) would still be fairly reliable because of the central limit theorem.

R7.7 (a) The mean is $\mu_{\bar{x}} = 0.5$ and the standard deviation is $\sigma_{\bar{x}} = \frac{0.7}{\sqrt{50}} = 0.0990$. (b) Because we have a large sample size (larger than 30), the Central Limit Theorem applies and $P(\bar{x} \ge 0.6) = P\left(z \ge \frac{0.6 - 0.5}{0.0990}\right) = P(z \ge 1.01) = 0.1562$.


AP Statistics Practice Test (page 459)

T7.1 c. The statistic is a measure of the sample and the parameter is a measure of the population.

T7.2 c. Sample size has no effect on the bias of an estimate, but larger samples will reduce the variability of an estimate.

T7.3 c. To use the Normal approximation, both $np$ and $n(1-p)$ must be at least 10. In all options other than c this condition is not met.

T7.4 a. The central limit theorem is not needed when the original distribution is Normal; the distribution of the sample mean is always Normal in that case.

T7.5 b. Since a sample of 3% of the undergraduates from Ohio State University consists of approximately 1200 students whereas a sample of 3% of the undergraduates from Johns Hopkins consists of just 60 students, the estimate from Ohio State University will have less sampling variability.

T7.6 b. The variation in the sample mean is related to the square root of the sample size, so if you double the sample size, the variation is reduced by a factor of the square root of 2.

T7.7 b. The sampling distribution would be only approximately Normal with mean equal to the population proportion (0.55 in this case) and standard deviation equal to $\sqrt{\frac{(0.55)(0.45)}{250}} = 0.03$.

T7.8 e. The sampling distribution has information about how the sample mean varies from sample to sample, not what any sample itself looks like.

T7.9 e. We do not have a sample size large enough to use the central limit theorem and we do not know that the distribution of fill amounts is Normal.

T7.10 e. The distribution of the average amount of pay will not be Normal because there are only three possible outcomes, $\bar{x}$ = 40, 60, or 80.

T7.11 Sample statistic A provides the best estimate of the parameter. Both statistics A and B appear unbiased, while statistic C appears to be biased (low). In addition, statistic A has lower variability than statistic B. In this situation, we want low bias and low variability, so statistic A is the best choice.

T7.12 (a) We cannot calculate this probability because we do not know the shape of the distribution of the amount that individual households pay for internet service. For instance, we know that “many households pay about $10” but we don’t know what percent of households are in this category. (b) The mean of the sampling distribution of the sample mean is the same as the mean of the original distribution. Therefore $\mu_{\bar{x}} = \mu_X = \$28$. Also we know that $\sigma_{\bar{x}} = \frac{10}{\sqrt{500}} = 0.4472$ because there are at least $10(500) = 5000$ households with internet access. (c) Since the sample size is large (much more than 30), the central limit theorem tells us that the sampling distribution of the sample mean is approximately Normal. (d) $P(\bar{x} > 29) = P\left(z > \frac{29 - 28}{0.4472}\right) = P(z > 2.24) = 0.0125$. There is about a 1.25% chance of getting a sample of 500 in which the average amount paid for the Internet is more than $29.


T7.13 We know that $\mu_{\hat{p}} = p = 0.22$. Since the population size (all children under 6 years old) is much more than 10 times the sample size (300), we know that $\sigma_{\hat{p}} = \sqrt{\frac{0.22(0.78)}{300}} = 0.0239$. Finally, since $np = 300(0.22) = 66$ and $n(1-p) = 300(0.78) = 234$ are both at least 10, the sampling distribution of the sample proportion is approximately Normal. This leads to the following probability calculation: $P(\hat{p} > 0.2) = P\left(z > \frac{0.2 - 0.22}{0.0239}\right) = P(z > -0.84) = 0.7995$. There is about an 80% chance that a sample of 300 children will yield more than 20% who live in households with incomes less than the official poverty level.
