Chapter 7 Online Quiz
Chapter 7: Sampling Distributions
The Practice of Statistics for AP*, 4e
© W.H. Freeman/BFW Publishers 2011

1. A news magazine claims that 30% of all New York City police officers are overweight. Indignant at this claim, the New York City police commissioner conducts a survey in which 200 randomly selected New York City police officers are weighed. Of the surveyed officers, 52, or 26%, turn out to be overweight. Which of the following statements about this situation is true?
*A. The number 26% is a statistic.
AR. Correct. The number 26% refers to the percentage of sampled officers who are overweight, and therefore it is a statistic.
B. The number 30% is a statistic.
BR. Incorrect. The number 30% is a parameter, since it is a characteristic of the entire population of interest, that is, the population of all New York City police officers.
C. The number 26% is a parameter.
CR. Incorrect. The number 26% is a statistic, since it is measured from the sample of 200 police officers and would very probably be different if the sample of 200 officers contained different officers.

2. In a simple random sample of 1000 Americans, it was found that 61% were satisfied with the service provided by the dealer from which they bought their car. In a simple random sample of 1000 Canadians, 58% said that they were satisfied with the service provided by their car dealer. Which of the following statements concerning the sampling variability of these statistics is true?
*A. The sampling variability is about the same in both cases.
AR. Correct. As long as the population is much larger than the sample (say, at least 10 times as large), the spread of the sampling distribution for a sample of fixed size n will be approximately the same for any population size. Here, n = 1000, and the populations of car owners of both the United States and Canada are at least 10 times this large. Therefore, the sampling variability associated with “the proportion satisfied out of 1000” is about the same in both cases.
B. The sampling variability is much smaller for the statistic based on the sample of 1000 Canadians because the population of Canada is smaller than that of the United States and, therefore, the sample is a larger proportion of the population.
BR. Incorrect. As long as the population is much larger than the sample (say, at least 10 times as large), the spread of the sampling distribution for a sample of fixed size n will be approximately the same for any population size. In this case, the population sizes are not relevant, since both populations are at least 10 times as large as the sample size n = 1000.
C. The sampling variability is much larger for the statistic based on the sample of 1000 Canadians, because Canada has a lower population density than the United States and having subjects living farther apart always increases sampling variability.
CR. Incorrect. The additional information supplied in this answer is irrelevant to the question of interest. The physical distance between subjects or units need not, in general, have any effect on sampling variability.

3. In a statistics class of 250 students, each student is instructed to toss a coin 20 times and record the value of p̂, the sample proportion of heads. The instructor then makes a histogram of the 250 values of p̂ obtained. In a second statistics class of 200 students, each student is told to toss a coin 40 times and record the value of p̂, the sample proportion of heads. The instructor then makes a histogram of the 200 values of p̂ obtained. Which of the following statements regarding the two histograms of p̂-values is true?
A. The first class’s histogram is more biased because it is derived from a smaller number of tosses per student.
AR. Incorrect. The sample proportion p̂ is an unbiased estimator of the parameter p. This property holds regardless of the number of trials (tosses) used to get each estimate of p̂.
*B. The first class’s histogram has greater spread (variability) because it is derived from a smaller number of tosses per student.
BR. Correct. Larger samples (based on more trials or tosses) yield sampling distributions with smaller spreads. If the coin had been tossed 100 times by each student, the variability would have been even smaller.
C. The first class’s histogram has less spread (variability) because it is derived from a larger number of students.
CR. Incorrect. The number of students (200 or 250) is the size of each “simulation.” It doesn’t affect the variability of the estimate. Using more students would give a more accurate picture of the sampling distribution. However, the variability of the estimator depends only on the number of trials (tosses) used to get each estimate, not on how large the “simulation” is.

4. As part of a promotion for a new type of cracker, free samples are offered to shoppers in a local supermarket. The probability that a shopper will buy a package of crackers after tasting the free sample is 0.2. Different shoppers can be regarded as independent trials. Let p̂ be the sample proportion of the next n shoppers that buy a packet of crackers after tasting a free sample. How large should n be so that the standard deviation of p̂ is no more than 0.01?
A. 4
AR. Incorrect. The value n = 16 makes the variance of p̂, that is, p(1 − p)/n, equal to 0.01. You need to make the square root of this expression equal to 0.01. This is not accomplished by taking the square root of 16.
B. 16
BR. Incorrect. The value n = 16 makes the variance of p̂, that is, p(1 − p)/n, equal to 0.01. We want the standard deviation of p̂ to be 0.01.
*C. 1600
CR. Correct. Substituting n = 1600 and p = 0.2 into the formula for the variance of p̂, that is, p(1 − p)/n, yields (0.2)(0.8)/1600 = 0.0001. The standard deviation of p̂ is therefore √0.0001 = 0.01.
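The arithmetic in Question 4 can be checked with a short Python sketch. It assumes the usual formula for the standard deviation of p̂, √(p(1 − p)/n), and solves for the smallest n that meets the target. The helper names `sd_phat` and `min_n_for_sd` are illustrative only and do not come from the quiz. Exact fractions are used for the sample-size step so that floating-point rounding cannot push the answer to 1601.

```python
import math
from fractions import Fraction


def sd_phat(p: float, n: int) -> float:
    """Standard deviation of the sample proportion for n independent trials."""
    return math.sqrt(p * (1 - p) / n)


def min_n_for_sd(p: Fraction, target_sd: Fraction) -> int:
    """Smallest n with sqrt(p(1 - p)/n) <= target_sd (illustrative helper).

    Rearranging gives n >= p(1 - p) / target_sd**2; exact fractions avoid
    floating-point error in the ceiling step.
    """
    return math.ceil(p * (1 - p) / target_sd ** 2)


# Compare the three answer choices from Question 4.
for n in (4, 16, 1600):
    print(f"n = {n:4d}: SD of p-hat = {sd_phat(0.2, n):.4f}")

print("smallest n with SD <= 0.01:", min_n_for_sd(Fraction(1, 5), Fraction(1, 100)))
```

Under these assumptions, n = 4 and n = 16 give standard deviations of 0.2 and 0.1, both well above the 0.01 target, and only n = 1600 reaches it.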

5. As part of a promotion for a new type of cracker, free samples are offered to shoppers in a local supermarket. The probability that a shopper will buy a package of crackers after tasting the free sample is 0.2. Different shoppers can be regarded as independent trials. Let p̂ be the sample proportion of the next 100 shoppers that buy a package of crackers after tasting a free sample. Which of the following best describes the sampling distribution of the statistic p̂?
A. It is approximately Normal with mean μ = 0.2 and standard deviation σ = 0.0016.
AR. Incorrect. The distribution of p̂ is approximately Normal with mean 0.2, but the standard deviation is not 0.0016. The variance of p̂ is p(1 − p)/n = 0.0016. You have neglected to take the square root.
*B. It is approximately Normal with mean μ = 0.2 and standard deviation σ = 0.04.
BR. Correct. The distribution of p̂ is approximately Normal with mean p = 0.2 and standard deviation √(p(1 − p)/n) = √[(0.2)(0.8)/100] = √0.0016 = 0.04.
C. It cannot be approximated by a Normal distribution.
CR. Incorrect. Since n = 100 and p = 0.2 satisfy the conditions np ≥ 10 and n(1 − p) ≥ 10, we can use the Normal distribution to approximate the sampling distribution of p̂.
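The quantities discussed in Question 5 can be reproduced with the minimal Python sketch below. It assumes the formulas quoted in the answers (mean p, variance p(1 − p)/n) and the Normal-approximation conditions np ≥ 10 and n(1 − p) ≥ 10. The function name `phat_summary` is a hypothetical helper, not part of the quiz.

```python
import math


def phat_summary(p: float, n: int) -> dict:
    """Mean, variance, and SD of p-hat, plus the Normal-approximation check."""
    variance = p * (1 - p) / n
    return {
        "mean": p,
        "variance": variance,
        "sd": math.sqrt(variance),  # the square root that option A forgets
        "normal_ok": n * p >= 10 and n * (1 - p) >= 10,
    }


summary = phat_summary(p=0.2, n=100)
print(f"mean      = {summary['mean']:.2f}")
print(f"variance  = {summary['variance']:.4f}")
print(f"sd        = {summary['sd']:.2f}")
print(f"Normal approximation conditions met: {summary['normal_ok']}")
```

Keeping the variance and the standard deviation as separate entries makes the trap in option A visible: 0.0016 is the variance, and 0.04 is its square root.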

6. As part of a promotion for a new type of cracker, free samples are offered to shoppers in a local supermarket. The probability that a shopper will buy a package of crackers after tasting the free sample is 0.2. Different shoppers can be regarded as independent trials. Let p̂ be the sample proportion of the next 100 shoppers that buy a package of crackers after tasting a free sample. The probability that fewer than 30% of these individuals buy a package of crackers after tasting a sample is approximately (without using the continuity correction)
A. 0.3.
AR. Incorrect. You have mistaken the value 30% for the desired probability, P(p̂ ≤ 0.3). The statistic p̂ has an approximately Normal distribution with mean 0.2 and standard deviation 0.04.
*B. 0.9938.
BR. Correct. The statistic p̂ has an approximately Normal distribution with mean 0.2 and standard deviation 0.04. You need to find P(p̂ ≤ 0.3). Using standardization and the Normal table, we get P(p̂ ≤ 0.3) ≈ P(Z ≤ (0.3 − 0.2)/0.04) = P(Z ≤ 2.5) = 0.9938.
C. 0.0062.
CR. Incorrect. The statistic p̂ has an approximately Normal distribution with mean 0.2 and standard deviation 0.04. You need to find P(p̂ ≤ 0.3). Instead, you have calculated P(p̂ ≥ 0.3).

7. Suppose that you are a student worker in the statistics department and agree to be paid according to the “random pay” system. Each week, the chair of the department flips a coin. If the coin comes up heads, your pay for the week is $80. If it comes up tails, your pay for the week is $40. You work for the department for 100 weeks (at which point you have learned enough probability to know that the “random pay” system is not to your advantage). The probability that x̄, your average earnings in the first two weeks, is greater than $65 is
*A. 0.2500.
AR. Correct. For x̄ to be greater than $65, you must earn $80 in both weeks. Because the determinations of your salary in different weeks are independent, the probability of earning $80 in both weeks is equal to the probability of the coin coming up heads both times, that is, (0.5)(0.5) = 0.25.
B. 0.3333.
BR. Incorrect. The possible values for x̄ are $40, $60, and $80 with probabilities 0.25, 0.50, and 0.25, respectively. You would get this answer if you incorrectly treated these three outcomes as equally likely.
C. 0.5000.
CR. Incorrect. This is the probability that you earn more than $65 in a single week.

8. The sampling distribution of the sample mean x̄ is formed from random samples of size 16 taken from a population with mean μ = 64 and standard deviation σ = 10. What are the mean and standard deviation of the sampling distribution of x̄?
A. mean = 64, standard deviation = 0.625
AR. Incorrect. You have the correct mean, but you should divide by √n, not by n, when computing the standard deviation of x̄.
B. mean = 8, standard deviation = 2.5
BR. Incorrect. You have the correct standard deviation, but you should not take the square root of the population mean to find the mean of x̄.
*C. mean = 64, standard deviation = 2.5
CR. Correct. The sampling distribution of x̄ has mean μ = 64 and standard deviation σ/√n = 10/√16 = 10/4 = 2.5.

9. The scores of individual students on the American College Testing (ACT) program composite college entrance examination have a Normal distribution with mean 18.6 and standard deviation 6.0. At Northside High, 36 seniors take the ACT test. If the scores at this school have the same distribution as the national scores, the sampling distribution of the average (sample mean) score x̄ for these 36 students is
A. approximately Normal, but the approximation is poor.
AR. Incorrect. If a population is Normally distributed with mean μ and standard deviation σ, then the sample mean x̄ of n independent observations is Normally distributed with mean μ and standard deviation σ/√n. This is the exact distribution, not an approximation.
B. approximately Normal, but the approximation is good.
BR. Incorrect. If a population is Normally distributed with mean μ and standard deviation σ, the sample mean x̄ of n independent observations is Normally distributed with mean μ and standard deviation σ/√n. This is the exact distribution, not an approximation.
*C. exactly Normal.
CR. Correct. If a population is Normally distributed with mean μ and standard deviation σ, the sample mean x̄ of n independent observations is Normally distributed with mean μ and standard deviation σ/√n. This is the exact distribution, not an approximation.

10. The duration of Alzheimer’s disease, from the onset of symptoms until death, ranges from 3 to 20 years, with a mean of 8 years and a standard deviation of 4 years. The administrator of a large medical center randomly selects the medical records of 30 deceased Alzheimer’s patients and records the duration of the disease for each one. Find the probability that the average duration of the disease for the 30 patients will exceed 8.25 years.
A. 0.6331
AR. Incorrect. You have calculated the probability that the average duration is less than 8.25 years.
*B. 0.3669
BR. Correct. We seek P(x̄ > 8.25), where x̄ = the average duration of the disease for the 30 patients. By the central limit theorem, x̄ has an approximately Normal distribution with mean μ = 8 and standard deviation σ/√n = 4/√30 ≈ 0.73. Using standardization and the Normal table, we get P(x̄ > 8.25) ≈ P(z > (8.25 − 8)/0.73) ≈ P(z > 0.34) = 1 − 0.6331 = 0.3669.
C. 0.4761
CR. Incorrect. We seek P(x̄ > 8.25), where x̄ = the average duration of the disease for the 30 patients. By the central limit theorem, x̄ has an approximately Normal distribution with mean μ = 8 and standard deviation σ/√n = 4/√30 ≈ 0.73. You have done the calculation of P(x̄ > 8.25) using the original standard deviation σ, rather than σ/√n.

11. The duration of Alzheimer’s disease, from the onset of symptoms until death, ranges from 3 to 20 years, with a mean of 8 years and a standard deviation of 4 years. The administrator of a large medical center randomly selects the medical records of 30 deceased Alzheimer’s patients and records the duration of the disease for each one. Find the probability that the average duration of the disease for the 30 patients will lie within 1 year of the overall mean of 8 years.
*A. 0.8294
AR. Correct. We seek P(7 < x̄ < 9), where x̄ = the average duration of the disease for the 30 patients. By the central limit theorem, x̄ has an approximately Normal distribution with mean μ = 8 and standard deviation σ/√n = 4/√30 ≈ 0.73. Using standardization and the Normal table, we get P(7 < x̄ < 9) ≈ P((7 − 8)/0.73 < z < (9 − 8)/0.73) ≈ P(−1.37 < z < 1.37) = 0.9147 − 0.0853 = 0.8294.
B. 0.1706
BR. Incorrect. You have determined the probability that the average duration lies outside the given interval.
C. 0.4147
CR. Incorrect. You have determined the probability that the average duration lies at most 1 year above the overall mean of 8 years. You must also consider the possibility that the average duration lies below the overall mean.
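The Normal-table calculations in Questions 10 and 11 can be cross-checked in Python. The sketch below assumes the central limit theorem approximation described in the answers and uses `statistics.NormalDist` from the standard library. The function names `prob_mean_exceeds` and `prob_mean_between` are illustrative and not part of the quiz. Because the code does not round z to two decimals, its results should agree with the table-based answers 0.3669 and 0.8294 only to about two or three decimal places.

```python
import math
from statistics import NormalDist

MU, SIGMA, N = 8.0, 4.0, 30  # values stated in Questions 10 and 11

# Approximate sampling distribution of the sample mean (central limit theorem):
# Normal with mean MU and standard deviation SIGMA / sqrt(N).
xbar_dist = NormalDist(mu=MU, sigma=SIGMA / math.sqrt(N))


def prob_mean_exceeds(value: float) -> float:
    """P(sample mean > value) under the Normal approximation."""
    return 1.0 - xbar_dist.cdf(value)


def prob_mean_between(low: float, high: float) -> float:
    """P(low < sample mean < high) under the Normal approximation."""
    return xbar_dist.cdf(high) - xbar_dist.cdf(low)


print(f"SD of the sample mean = {xbar_dist.stdev:.4f}")
print(f"Question 10: P(mean > 8.25)   = {prob_mean_exceeds(8.25):.4f}")
print(f"Question 11: P(7 < mean < 9)  = {prob_mean_between(7.0, 9.0):.4f}")
```

Replacing `SIGMA / math.sqrt(N)` with `SIGMA` reproduces the mistake described in option C of Question 10, which uses the population standard deviation in place of σ/√n.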

12. The duration of Alzheimer’s disease, from the onset of symptoms until death, ranges from 3 to 20 years, with a mean of 8 years and a standard deviation of 4 years. The administrator of a large medical center randomly selects the medical records of 30 deceased Alzheimer’s patients and records the duration of the disease for each one. Find the value L such that there is a probability of 0.99 that the average duration of the disease for the 30 patients lies less than L years above the overall mean of 8 years.
A. 0.72
AR. Incorrect. Recall that you must set the standardized unknown value equal to a value of z, the standard Normal distribution, when solving a problem of this type. In this case, you have used the value of the area, 0.99, in place of the required value of z.
*B. 1.70
BR. Correct. We seek L such that P(x̄ < 8 + L) = 0.99, where x̄ = the average duration of the disease for the 30 patients. By the central limit theorem, x̄ has an approximately Normal distribution with mean μ = 8 and standard deviation σ/√n = 4/√30 ≈ 0.73. Thus, P(x̄ < 8 + L) = P((x̄ − μ)/(σ/√n) < ((8 + L) − 8)/0.73) ≈ P(z < L/0.73), where z is the standard Normal random variable. The area in the Normal table closest to 0.99 is 0.9901, corresponding to an observation of z = 2.33. Setting L/0.73 = 2.33 and solving for L, we get L = (0.73)(2.33) ≈ 1.70.
C. 2.33
CR. Incorrect. Recall that in a problem of this type, the value of z corresponding to the given area, in this case 2.33, must be set equal to the standardized unknown value.
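Question 12 runs the Normal calculation in reverse, from a probability back to a value. As a minimal sketch, assuming the same central limit theorem approximation, the inverse lookup can be done with `NormalDist.inv_cdf` from the Python standard library. The function name `margin_above_mean` is a hypothetical helper. The software quantile is slightly smaller than the table value z = 2.33, so the result matches 1.70 only after rounding to two decimals.

```python
import math
from statistics import NormalDist


def margin_above_mean(sigma: float, n: int, prob: float) -> float:
    """Value L with P(sample mean < mu + L) = prob under the Normal approximation."""
    z = NormalDist().inv_cdf(prob)        # standard Normal quantile for the given area
    return z * sigma / math.sqrt(n)       # un-standardize: L = z * (sigma / sqrt(n))


z_99 = NormalDist().inv_cdf(0.99)
L = margin_above_mean(sigma=4.0, n=30, prob=0.99)
print(f"z for area 0.99 = {z_99:.3f}")
print(f"L               = {L:.2f}")
```

The two wrong options correspond to skipping one of the two steps in the function: option A multiplies the standard deviation by the area 0.99 without looking up z, and option C reports z itself without multiplying by σ/√n.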
