Distribution of the Sample Mean
Estimation of the population mean
• In many investigations the data of interest take on a wide range of possible values.
• Examples: attachment loss (mm) and DMFS.
• With this type of data it is often of interest to estimate the population mean, μ.
• A common estimator for μ is the sample mean, X̄.
• In this lecture we will focus on the sampling distribution of X̄.
Example: Fluoride Varnish Study*
• Children in Yakima, WA were randomized to two different methods of fluoride varnish delivery.
• Followed for ~3 years.
• Outcome of interest was the number of surfaces with new decay.
* Weinstein, P. et al. Caries Research 2009;43(6):484-90.
Example: Fluoride Varnish Study
• Can summarize the observed data with the sample mean and standard deviation.
• The sample mean is used as an estimate of the true population mean.
• X̄ = 7.4 for the "Standard" group.
• How good an estimate is it?
[Figure: group means ± standard deviations]
Example: Fluoride Varnish Study
• X̄ is a random variable.
• Its value is determined by which people are randomly chosen to be in the sample.
• Many possible samples, many possible X̄'s.
[Figure: histograms of many possible samples of the same size, each with its own sample mean — X̄ = 7.4, 7.8, 7.8, 6.9, 7.2, 7.0, 7.6, 8.1, 7.5, 7.0, 6.8, 8.0, 7.8, 6.6, 7.4, 7.0, 7.9, 7.4, 7.3, 7.0]
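The repeated-sampling idea in the figure above can be sketched in a few lines of Python. The population below is simulated (the real study data are not reproduced here), so its mean and spread are purely illustrative assumptions.

```python
import random
import statistics

random.seed(1)

# Hypothetical population of decay outcomes; illustrative values only,
# not the actual fluoride varnish study data.
population = [random.gauss(7.4, 5.0) for _ in range(10_000)]

# Draw many samples of the same size and record each sample mean.
sample_means = [
    statistics.mean(random.sample(population, k=100))
    for _ in range(25)
]

# Every sample gives a slightly different X-bar, just like the panels above.
print([round(m, 1) for m in sample_means])
```

Each run of the sampling step produces a different X̄, which is exactly why X̄ is treated as a random variable with its own distribution.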
Example: Fluoride Varnish Study
• In our study we only see one occurrence of the sample mean.
• We will have a better idea of how good our one estimate is if we have good knowledge of how X̄ behaves.
• That is, if we know the probability distribution of X̄.
The Central Limit Theorem
• An important result in probability theory states that the probability distribution for averages (i.e. X̄) is the Normal distribution.*
• The size of the sample needs to be reasonably large.
• This result will often hold regardless of the distribution of the original data.
* some restrictions will apply
Probability distribution for X̄
[Figure: Normal curve centred at μ]
• Approximation with the Normal distribution is not as good with only 10 observations.
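The Central Limit Theorem can be checked by simulation. The sketch below uses an exponential population (a strongly right-skewed distribution, chosen only for illustration): the means of larger samples cluster tightly around the true mean and their histogram looks increasingly Normal.

```python
import random
import statistics

random.seed(2)

def mean_of_sample(n):
    # random.expovariate(1.0) is right-skewed, with true mean 1.0 --
    # nothing like a Normal distribution.
    return statistics.mean(random.expovariate(1.0) for _ in range(n))

# Sampling distribution of X-bar for a small and a larger sample size.
means_n10 = [mean_of_sample(10) for _ in range(5000)]
means_n100 = [mean_of_sample(100) for _ in range(5000)]

# Both centre near the true mean (1.0), but the n = 100 means are far
# less spread out and much closer to a Normal shape.
print(round(statistics.mean(means_n10), 2), round(statistics.stdev(means_n10), 2))
print(round(statistics.mean(means_n100), 2), round(statistics.stdev(means_n100), 2))
```

This matches the slide's caveat: with only 10 observations from skewed data the Normal approximation is rougher than with 100.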
More on the distribution of X̄
• The expected value of X̄ is μ.
• X̄ is "unbiased".
• On average, X̄ is exactly on target as an estimator of μ.
More on the distribution of X̄
• The standard deviation of X̄ is

    SE(X̄) = σ / √n

• σ is the standard deviation in the population.
• n is the number of people in the sample.
• It is called the standard error of the mean, or SEM.
More on the distribution of X̄

    SE(X̄) = σ / √n

• One can think of SE(X̄) as the average error that X̄ makes when estimating μ, i.e. the precision of the estimate.
• The precision of X̄ is better (SEM is smaller) when the sample is larger (larger n).
• The precision is worse (SEM is greater) when the population is more variable (greater σ).
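The SEM formula is easy to evaluate directly. Using σ = 20.6 oz (the value from the birthweight example later in the lecture) shows how quickly precision improves with sample size:

```python
import math

sigma = 20.6  # population standard deviation (oz), from the birthweight example
for n in (10, 20, 100, 1000):
    sem = sigma / math.sqrt(n)  # SE(X-bar) = sigma / sqrt(n)
    print(n, round(sem, 2))
```

Quadrupling n only halves the SEM, a consequence of the square root in the denominator.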
More on the distribution of X̄
• By the Central Limit Theorem, when n is reasonably large the distribution of X̄ will be approximately Normal, with mean μ and standard deviation σ/√n:

    X̄ ~ Normal(μ, σ²/n)
Example: Birthweight data
• The histogram shows the distribution of birthweights at a Boston hospital.
• Estimate the probability that the mean birthweight of the next 20 babies born will be greater than 120 oz.
[Figure: histogram of birthweights; μ = 112 oz, σ = 20.6 oz]
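A sketch of the calculation the slide asks for, using only the Python standard library (the standard Normal CDF Φ is obtained from math.erf):

```python
import math

def normal_cdf(z):
    # Standard Normal CDF via the error function (stdlib only).
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))

mu, sigma, n = 112.0, 20.6, 20
sem = sigma / math.sqrt(n)      # standard error of the mean, about 4.61 oz
z = (120.0 - mu) / sem          # Z score for X-bar = 120, about 1.74
p = 1.0 - normal_cdf(z)         # P(X-bar > 120)
print(round(sem, 2), round(z, 2), round(p, 3))
```

The mean of 20 babies exceeding 120 oz is far less likely than a single baby exceeding 120 oz, because the mean varies much less than individuals do.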
Law of Large Numbers
• Recall X̄ ~ Normal(μ, σ²/n).
• As n gets large, the distribution of X̄ is forced closer and closer to μ.
• With larger sample sizes, X̄ provides a better estimate of μ.
• The same is true for the sample standard deviation s.
• As the sample size increases, s should get closer to the population standard deviation σ.
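Both convergence claims on this slide can be seen in a small simulation. The population parameters below (μ = 7.4, σ = 5.0) are illustrative assumptions, not study values:

```python
import random
import statistics

random.seed(3)
mu, sigma = 7.4, 5.0  # hypothetical population mean and SD

# As n grows, the sample mean drifts toward mu and the sample SD toward sigma.
for n in (10, 100, 10_000):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = statistics.mean(sample)
    s = statistics.stdev(sample)
    print(n, round(xbar, 2), round(s, 2))
```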
Standard Error versus Standard Deviation
• Standard deviation: describes the variability of a population or a sample.
• Standard error: describes the variability of an estimator, which is usually a function of the whole sample.
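The contrast is easy to see numerically: as the sample grows, the sample SD settles near σ (it describes individuals), while the standard error keeps shrinking (it describes the estimator). The population values below are illustrative.

```python
import math
import random
import statistics

random.seed(4)

for n in (25, 2500):
    sample = [random.gauss(112, 20.6) for _ in range(n)]
    sd = statistics.stdev(sample)   # variability of individuals: stays near sigma
    se = sd / math.sqrt(n)          # variability of the estimator: shrinks with n
    print(n, round(sd, 1), round(se, 2))
```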
Confidence intervals for the mean

    (X̄ − μ) / (σ/√n) ~ N(0, 1)

• If n is large enough we can use this result to construct a confidence interval for μ.
• However, this would result in a formula that involves σ, a value that we don't usually know.
• In practice we will estimate σ with the sample standard deviation, s.
• Substituting the random variable s for σ will alter the distribution of the Z score slightly.
The t distribution
The distribution of the statistic

    T = (X̄ − μ) / (s/√n)

is called a "t" distribution with n − 1 "degrees of freedom", and is denoted by t_{n−1}.
The t distribution

    T = (X̄ − μ) / (s/√n)

• The shape of the t distribution is similar to the Normal distribution, but it has higher variability.
• How much higher depends on the degrees of freedom, which depends on the sample size.
The t distribution

    T = (X̄ − μ) / (s/√n)

• The larger the sample, the less variability.
• t distributions with higher degrees of freedom are more similar to the Normal distribution.
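The extra variability of the t distribution can be demonstrated by simulating the T statistic itself for a small sample (n = 5, so 4 degrees of freedom). If T were standard Normal, only about 5% of values would fall beyond ±1.96; with 4 degrees of freedom, noticeably more do.

```python
import math
import random
import statistics

random.seed(5)
mu, sigma, n = 0.0, 1.0, 5  # small sample -> t with 4 degrees of freedom

t_stats = []
for _ in range(20_000):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = statistics.mean(sample)
    s = statistics.stdev(sample)
    # Same statistic as on the slide, with s substituted for sigma.
    t_stats.append((xbar - mu) / (s / math.sqrt(n)))

# Fraction of simulated T statistics beyond the Normal's 95% cutoffs.
tail = sum(abs(t) > 1.96 for t in t_stats) / len(t_stats)
print(round(tail, 3))
```

The heavier tails come from the extra randomness of estimating σ with s from only 5 observations.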
Confidence intervals for the mean
• If X is Normal or n is large, then T = (X̄ − μ) / (s/√n) follows a t distribution with n − 1 degrees of freedom, and

    P(−t_{n−1, 0.975} < T < t_{n−1, 0.975}) = 0.95
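Putting the pieces together, a 95% t interval can be computed by hand as X̄ ± t_{n−1,0.975} · s/√n. The sample of 20 birthweights below is made up for illustration, and t_{19,0.975} ≈ 2.093 is taken from a t table.

```python
import math
import statistics

# Hypothetical sample of n = 20 birthweights (oz); illustration only.
sample = [118, 99, 126, 107, 112, 131, 104, 121, 95, 116,
          109, 124, 102, 138, 111, 98, 119, 127, 106, 114]

n = len(sample)
xbar = statistics.mean(sample)
s = statistics.stdev(sample)
t_crit = 2.093  # t_{19, 0.975}, from a t table

# 95% confidence interval: X-bar +/- t * s / sqrt(n)
half_width = t_crit * s / math.sqrt(n)
print(round(xbar - half_width, 1), round(xbar + half_width, 1))
```

With a larger sample the critical value would shrink toward the Normal's 1.96 and the interval would narrow, for both reasons discussed above.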