CH18 Sampling Distribution Models. Sampling distributions. Sampling distribution of sample proportion and sample mean. Central Limit Theorem

Author: Arleen Curtis
CH18 Sampling Distribution Models

• Sampling distributions.
• Sampling distribution of sample proportion and sample mean.
• Central Limit Theorem.

Motivation • Who is the next president of the U.S.?

• The opinion of the whole population is not known until the election day. • However, we can draw random samples to estimate the population.

Motivation
• Fifteen random samples of size 20 (R = Romney, O = Obama), each with its sample proportion for R:

Sample                  Sample proportion for R
ORRROROOOORRROROORRR    11/20 = 0.55
RROORROOOOORROROOORO     8/20 = 0.40
RROROOROORRRRROOOOOR    10/20 = 0.50
OOROORROORORRRROORRR    11/20 = 0.55
ROROOOOOROORORRORORR     9/20 = 0.45
OROOROROORRROORROORO     9/20 = 0.45
ROROOOORRORORRRRRROO    11/20 = 0.55
OOOOROROOORORRROOROR     8/20 = 0.40
RORRORORORRRROOOROOO    10/20 = 0.50
RRRROOOROOORRROOORRO    10/20 = 0.50
RRRROORRRORORRORRORR    14/20 = 0.70
ROOORROOOROORROROROR     9/20 = 0.45
RROORORORRRRROORRROR    13/20 = 0.65
OORROROROOOROOROROOO     7/20 = 0.35
RROOROROORRRROROOORR    11/20 = 0.55

Motivation • The histogram of 15 sample proportions (sample size 20)

Motivation • The histogram of 150 sample proportions (sample size 20)

Motivation • The histogram of 1500 sample proportions (sample size 20)

Motivation • The histogram of 1500 sample proportions (sample size 200)

Motivation • The histogram of 1500 sample proportions (sample size 400)

Motivation • The histogram of 1500 sample proportions (sample size 800) • What distribution fits the data well?

Sampling distribution • For a random sample, the sample proportion is random. • In other words, sample proportion is a random variable. What is the distribution of this random variable? • The distribution of the sample proportions is approximately Normal.
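The behaviour seen in the histograms can be reproduced with a short simulation. The following is only a sketch: the population proportion p = 0.5, the seed, and the function names are assumptions made for illustration and do not come from the slides. It draws 1500 samples at each of three sample sizes and compares the spread of the sample proportions with √(p(1 − p)/n).

```python
import math
import random


def simulate_sample_proportions(p, n, num_samples, seed=0):
    """Draw num_samples random samples of size n and return their sample proportions."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    proportions = []
    for _ in range(num_samples):
        successes = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(successes / n)
    return proportions


def mean_and_sd(values):
    """Return the mean and the sample standard deviation of a list of numbers."""
    m = sum(values) / len(values)
    var = sum((v - m) ** 2 for v in values) / (len(values) - 1)
    return m, math.sqrt(var)


p = 0.5  # assumed population proportion (not stated on the slides)
results = {}
for n in (20, 200, 800):
    props = simulate_sample_proportions(p, n, num_samples=1500)
    m, sd = mean_and_sd(props)
    results[n] = (m, sd)
    theory = math.sqrt(p * (1 - p) / n)
    print(f"n={n:4d}  mean={m:.3f}  sd={sd:.4f}  sqrt(p(1-p)/n)={theory:.4f}")
```

The simulated means should sit close to p, and the simulated standard deviations should shrink as n grows, which is why the histograms become narrower for larger samples.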

Review
• In a population, there are usually parameters of interest whose values are unknown. e.g. the proportion of the population that supports Romney
• We use sample estimators to estimate the values of those parameters. e.g. the proportion of the sample that supports Romney
• The sample estimators are called sample statistics. e.g. the sample proportion is a statistic

Sampling distribution • The sampling distribution of a statistic is the distribution of values taken by that statistic in all possible samples of the same size from the same population. e.g. histogram of the sample proportion favoring Romney (1500 samples with sample size 800)

Sampling distribution of the sample proportion
The sampling distribution of p̂ is never exactly normal. But as the sample size increases, the sampling distribution of p̂ becomes approximately normal. The normal approximation is most accurate for any fixed n when p is close to 0.5, and least accurate when p is near 0 or near 1.

Sampling distribution
• Caution: the sampling distribution is not the same as the distribution of a sample. The sampling distribution of a sample proportion is approximately Normal. For the sample "ORRROROOOORRROROORRR", the distribution of this sample is:

Candidate   Count   Proportion
Romney      11      0.55
Obama        9      0.45

Sampling distribution of sample proportion
• Population proportion: $p = \dfrac{\text{number of individuals of interest in the population}}{\text{total number of individuals in the population}}$
• Sample proportion: $\hat{p} = \dfrac{\text{number of individuals of interest in the sample}}{\text{total number of individuals in the sample}}$

Sampling distribution of sample proportion
• Provided that the sampled values are independent (e.g. a simple random sample from a large population) and the sample size is large enough, the sampling distribution of the sample proportion $\hat{p}$ of samples of size n is approximately Normal with mean p and standard deviation $\sqrt{p(1-p)/n}$.
• Or: $\hat{p} \mathrel{\dot\sim} N\!\left(p,\ \sqrt{\dfrac{p(1-p)}{n}}\right)$

Calculating Probability with TI calculator

P(a < p̂ < b) = normalcdf(a, b, p, √(p(1 − p)/n))
P(p̂ < b) = normalcdf(−1E99, b, p, √(p(1 − p)/n))
P(p̂ > a) = normalcdf(a, 1E99, p, √(p(1 − p)/n))

Example

$\hat{p} \mathrel{\dot\sim} N\!\left(p,\ \sqrt{\dfrac{p(1-p)}{n}}\right) = N(0.13,\ 0.035)$, since $\sqrt{\dfrac{0.13(1-0.13)}{90}} \approx 0.035$.

Example
• Suppose the population proportion supporting Romney is 50%. How likely are we to get a sample of size 600 with a sample proportion of at least 48%?
• What if the population proportion is in fact 40%?
• What if the population proportion is actually 60%?
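One way to work this example without a calculator is sketched below. It uses the Normal model for p̂ with mean p and standard deviation √(p(1 − p)/n). The helper names `normal_cdf` and `prob_phat_at_least` are hypothetical, chosen for illustration only. The Normal CDF is built from `math.erf` in the Python standard library.

```python
import math


def normal_cdf(x, mean, sd):
    """P(X < x) for X ~ N(mean, sd), computed from the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def prob_phat_at_least(threshold, p, n):
    """Approximate P(p-hat >= threshold) using the Normal model for p-hat."""
    sd = math.sqrt(p * (1 - p) / n)
    return 1.0 - normal_cdf(threshold, p, sd)


# The three scenarios from the example: true p is 0.5, 0.4 or 0.6, with n = 600.
answers = {p: prob_phat_at_least(0.48, p, 600) for p in (0.5, 0.4, 0.6)}
for p, prob in answers.items():
    print(f"p = {p}: P(p-hat >= 0.48) = {prob:.6f}")
```

Under this model, a sample proportion of at least 48% is quite likely when p = 0.5, extremely unlikely when p = 0.4, and almost certain when p = 0.6.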

Assumptions and conditions
• Independence Assumption: the sampled values must be independent of each other.
  - Randomization Condition: the data come from a random sample.
  - 10% Condition: the sample size n must be no larger than 10% of the population.
• Sample Size Assumption: the sample size n must be large enough.
  - Success/Failure Condition: the sample size has to be big enough so that we expect at least 10 successes and at least 10 failures.
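The two numeric conditions can be checked mechanically. The sketch below is an illustration only: the function name and the returned dictionary keys are hypothetical. The Randomization Condition cannot be checked by arithmetic, because it depends on how the data were collected.

```python
def check_proportion_conditions(n, p, population_size=None):
    """Check the Success/Failure Condition and, if possible, the 10% Condition."""
    expected_successes = n * p
    expected_failures = n * (1 - p)
    report = {
        # At least 10 expected successes and at least 10 expected failures.
        "success_failure": expected_successes >= 10 and expected_failures >= 10,
    }
    if population_size is not None:
        # The sample should be no larger than 10% of the population.
        report["ten_percent"] = n <= 0.10 * population_size
    return report


print(check_proportion_conditions(n=90, p=0.13))
print(check_proportion_conditions(n=50, p=0.10, population_size=400))
```

In the second call both conditions fail: only 5 successes are expected, and 50 is more than 10% of a population of 400.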

Sampling distribution of sample mean • A fair die is tossed 10,000 times. Below is the histogram of the outcomes.

Sampling distribution of sample mean • Two fair dice are tossed 10,000 times. Below is the histogram of average of the outcomes.

Sampling distribution of sample mean • Four fair dice are tossed 10,000 times. Below is the histogram of average of the outcomes.

Sampling distribution of sample mean • Eight fair dice are tossed 10,000 times. Below is the histogram of average of the outcomes.

Sampling distribution of sample mean • Sixteen fair dice are tossed 10,000 times. Below is the histogram of average of the outcomes.
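The dice experiment is easy to imitate in code. The sketch below assumes a seeded pseudo-random generator in place of real dice, and its names are illustrative. For each number of dice k it records the mean and standard deviation of 10,000 averages and compares the latter with σ/√k, where σ = √(35/12) ≈ 1.71 is the standard deviation of a single fair die.

```python
import math
import random


def average_of_dice(k, rng):
    """Toss k fair dice and return the average of the outcomes."""
    return sum(rng.randint(1, 6) for _ in range(k)) / k


rng = random.Random(1)  # fixed seed so the run is repeatable
sigma = math.sqrt(35 / 12)  # standard deviation of one fair die
summary = {}
for k in (1, 2, 4, 8, 16):
    averages = [average_of_dice(k, rng) for _ in range(10_000)]
    m = sum(averages) / len(averages)
    sd = math.sqrt(sum((a - m) ** 2 for a in averages) / (len(averages) - 1))
    summary[k] = (m, sd)
    print(f"k={k:2d}  mean={m:.3f}  sd={sd:.3f}  sigma/sqrt(k)={sigma / math.sqrt(k):.3f}")
```

The mean of the averages stays near 3.5 for every k, while their spread shrinks roughly in proportion to 1/√k.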

Central limit theorem • The mean of a random sample has a sampling distribution whose shape can be approximated by a Normal model. The larger the sample, the better the approximation will be. • If the sample is from a population with Normal distribution, then the approximation is exact.

The central limit theorem
Central Limit Theorem: When randomly sampling from any population with mean µ and standard deviation σ, when n is large enough, the sampling distribution of x̄ is approximately normal: x̄ ~ N(µ, σ/√n).

[Figure panels: population with strongly skewed distribution; sampling distribution of x̄ for n = 2 observations; sampling distribution of x̄ for n = 10 observations; sampling distribution of x̄ for n = 25 observations]

Income distribution Let’s consider the very large database of individual incomes from the Bureau of Labor Statistics as our population. It is strongly right skewed. 

We take 1000 SRSs of 100 incomes, calculate the sample mean for each, and make a histogram of these 1000 means.



We also take 1000 SRSs of 25 incomes, calculate the sample mean for each, and make a histogram of these 1000 means.

Which histogram corresponds to samples of size 100? 25?

Sampling distribution of sample mean
• When a random sample is drawn from any population with mean µ and standard deviation σ, its sample mean x̄ has a sampling distribution with the same mean µ but standard deviation σ/√n.
• No matter what population the random sample comes from, the shape of the sampling distribution is approximately Normal as long as the sample size is large enough.
• The larger the sample used, the more closely the Normal approximates the sampling distribution for the mean.
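The mean and standard deviation of x̄ follow from a short calculation. The derivation below is a sketch that assumes the n observations X_1, ..., X_n are independent, each with mean µ and variance σ². The notation X_i is introduced here for illustration and does not appear on the slides.

```latex
\begin{aligned}
E(\bar{x}) &= E\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
            = \frac{1}{n}\sum_{i=1}^{n} E(X_i)
            = \frac{1}{n}\cdot n\mu = \mu, \\
\operatorname{Var}(\bar{x}) &= \frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{Var}(X_i)
            = \frac{1}{n^{2}}\cdot n\sigma^{2} = \frac{\sigma^{2}}{n}, \\
SD(\bar{x}) &= \sqrt{\operatorname{Var}(\bar{x})} = \frac{\sigma}{\sqrt{n}}.
\end{aligned}
```

Independence is what allows the variances to add in the second line.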

Calculating Probability for sample mean with TI calculator

P(a < x̄ < b) = normalcdf(a, b, µ, σ/√n)
P(x̄ < b) = normalcdf(−1E99, b, µ, σ/√n)
P(x̄ > a) = normalcdf(a, 1E99, µ, σ/√n)

The amount of soda in cans of a particular brand has a mean of 12 oz and a standard deviation of 0.2 oz. If you select random samples of 50 cans, what percentage of the sample means would be less than 11.95 oz?
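The soda question can be worked with the Normal model for the sample mean. The sketch below assumes that n = 50 is large enough for that model to be reasonable, and `normal_cdf` is a hypothetical helper that plays the role of the calculator's normalcdf(−1E99, x, mean, sd).

```python
import math


def normal_cdf(x, mean, sd):
    """P(X < x) for X ~ N(mean, sd), computed from the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


mu, sigma, n = 12.0, 0.2, 50          # values given in the problem
sd_of_mean = sigma / math.sqrt(n)     # standard deviation of the sample mean
prob_below = normal_cdf(11.95, mu, sd_of_mean)

print(f"SD of the sample mean: {sd_of_mean:.4f} oz")
print(f"P(sample mean < 11.95 oz) = {prob_below:.4f}")
```

Under these assumptions, a little under 4% of sample means would fall below 11.95 oz.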

• Assume that the systolic blood pressure of 30-year-old males is normally distributed, with an average of 122 mmHg and a standard deviation of 10 mmHg. A random sample of 16 men from this age group is selected.
• Calculate the probability that the average blood pressure of the sample will be greater than 125 mmHg.
• Calculate the probability that the average blood pressure of this sample will be between 118 and 124 mmHg.
• Calculate the probability that the blood pressure of an individual male from this population will be between 118 and 124 mmHg.
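The three blood pressure questions differ only in which standard deviation applies: σ/√n for the sample mean, σ for an individual. A minimal sketch follows, with hypothetical helper names. Because the population is stated to be Normal, the model for the sample mean is exact here rather than approximate.

```python
import math


def normal_cdf(x, mean, sd):
    """P(X < x) for X ~ N(mean, sd), computed from the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def prob_between(a, b, mean, sd):
    """P(a < X < b) for X ~ N(mean, sd), the analogue of normalcdf(a, b, mean, sd)."""
    return normal_cdf(b, mean, sd) - normal_cdf(a, mean, sd)


mu, sigma, n = 122.0, 10.0, 16        # values given in the problem
sd_of_mean = sigma / math.sqrt(n)     # 10 / 4 = 2.5 mmHg

p_mean_above_125 = 1.0 - normal_cdf(125, mu, sd_of_mean)
p_mean_118_124 = prob_between(118, 124, mu, sd_of_mean)
p_individual_118_124 = prob_between(118, 124, mu, sigma)

print(f"P(sample mean > 125)        = {p_mean_above_125:.4f}")
print(f"P(118 < sample mean < 124)  = {p_mean_118_124:.4f}")
print(f"P(118 < individual < 124)   = {p_individual_118_124:.4f}")
```

The same interval is far more likely to contain the mean of 16 men than the reading of a single man, because the sample mean varies less than an individual value.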

Suggested exercises from the textbook:

Ch18 5, 7, 11, 15, 17, 19, 25, 27, 33, 37, 39, 41, 43
