Distribution of the Sample Mean
Estimation of the population mean
• In many investigations the data of interest take on a wide range of possible values.
• Examples: attachment loss (mm) and DMFS.
• With this type of data it is often of interest to estimate the population mean, μ.
• A common estimator for μ is the sample mean, X̄.
• In this lecture we will focus on the sampling distribution of X̄.
Example: Fluoride Varnish Study*
• Children in Yakima, WA were randomized to two different methods of fluoride varnish delivery.
• Followed for ~3 years.
• Outcome of interest was the number of surfaces with new decay.
* Weinstein, P. et al. Caries Research 2009;43(6):484-90.
Example: Fluoride Varnish Study
• Can summarize the observed data with the sample mean and standard deviation.
• The sample mean is used as an estimate of the true population mean.
• X̄ = 7.4 for the "Standard" group.
• How good an estimate is it?
[Figure: group means ± standard deviations]
Example: Fluoride Varnish Study
• X̄ is a random variable.
• Its value is determined by which people are randomly chosen to be in the sample.
• Many possible samples, many possible X̄'s.
[Figure: histograms of many possible samples of the same size, each with its own sample mean — X̄ = 7.4, 7.8, 7.8, 6.9, 7.2, 7.0, 7.6, 8.1, 7.5, 7.0, 6.8, 8.0, 7.8, 6.6, 7.4, 7.0, 7.9, 7.4, 7.3, 7.0]
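The repeated-sampling idea in the figure above can be sketched in a few lines of Python. The population below is simulated (the real study data are not reproduced here), so its mean and spread are purely illustrative assumptions.

```python
import random
import statistics

random.seed(1)

# Hypothetical population of decay outcomes; illustrative values only,
# not the actual fluoride varnish study data.
population = [random.gauss(7.4, 5.0) for _ in range(10_000)]

# Draw many samples of the same size and record each sample mean.
sample_means = [
    statistics.mean(random.sample(population, k=100))
    for _ in range(25)
]

# Every sample gives a slightly different X-bar, just like the panels above.
print([round(m, 1) for m in sample_means])
```

Each run of the sampling step produces a different X̄, which is exactly why X̄ is treated as a random variable with its own distribution.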
Example: Fluoride Varnish Study
• In our study we only see one occurrence of the sample mean.
• We will have a better idea of how good our one estimate is if we have good knowledge of how X̄ behaves.
• That is, if we know the probability distribution of X̄.
The Central Limit Theorem
• An important result in probability theory states that the probability distribution for averages (i.e. X̄) is the Normal distribution.*
• The size of the sample needs to be reasonably large.
• This result will often hold regardless of the distribution of the original data.
* some restrictions will apply
Probability distribution for X̄
[Figure: Normal curve centred at μ]
• Approximation with the Normal distribution is not as good with only 10 observations.
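The Central Limit Theorem can be checked by simulation. The sketch below uses an exponential population (a strongly right-skewed distribution, chosen only for illustration): the means of larger samples cluster tightly around the true mean and their histogram looks increasingly Normal.

```python
import random
import statistics

random.seed(2)

def mean_of_sample(n):
    # random.expovariate(1.0) is right-skewed, with true mean 1.0 --
    # nothing like a Normal distribution.
    return statistics.mean(random.expovariate(1.0) for _ in range(n))

# Sampling distribution of X-bar for a small and a larger sample size.
means_n10 = [mean_of_sample(10) for _ in range(5000)]
means_n100 = [mean_of_sample(100) for _ in range(5000)]

# Both centre near the true mean (1.0), but the n = 100 means are far
# less spread out and much closer to a Normal shape.
print(round(statistics.mean(means_n10), 2), round(statistics.stdev(means_n10), 2))
print(round(statistics.mean(means_n100), 2), round(statistics.stdev(means_n100), 2))
```

This matches the slide's caveat: with only 10 observations from skewed data the Normal approximation is rougher than with 100.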
More on the distribution of X̄
• The expected value of X̄ is μ.
• X̄ is "unbiased".
• On average, X̄ is exactly on target as an estimator of μ.
More on the distribution of X̄
• The standard deviation of X̄ is

    SE(X̄) = σ / √n

• σ is the standard deviation in the population.
• n is the number of people in the sample.
• It is called the standard error of the mean, or SEM.
More on the distribution of X̄

    SE(X̄) = σ / √n

• One can think of SE(X̄) as the average error that X̄ makes when estimating μ, i.e. the precision of the estimate.
• The precision of X̄ is better (SEM is smaller) when the sample is larger (larger n).
• The precision is worse (SEM is greater) when the population is more variable (greater σ).
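The SEM formula is easy to evaluate directly. Using σ = 20.6 oz (the value from the birthweight example later in the lecture) shows how quickly precision improves with sample size:

```python
import math

sigma = 20.6  # population standard deviation (oz), from the birthweight example
for n in (10, 20, 100, 1000):
    sem = sigma / math.sqrt(n)  # SE(X-bar) = sigma / sqrt(n)
    print(n, round(sem, 2))
```

Quadrupling n only halves the SEM, a consequence of the square root in the denominator.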
More on the distribution of X̄
• By the Central Limit Theorem, when n is reasonably large the distribution of X̄ will be approximately Normal, with mean μ and standard deviation σ/√n:

    X̄ ~ Normal(μ, σ²/n)
Example: Birthweight data
• The histogram shows the distribution of birthweights at a Boston hospital.
• Estimate the probability that the mean birthweight of the next 20 babies born will be greater than 120 oz.
[Figure: histogram of birthweights; μ = 112 oz, σ = 20.6 oz]
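A sketch of the calculation the slide asks for, using only the Python standard library (the standard Normal CDF Φ is obtained from math.erf):

```python
import math

def normal_cdf(z):
    # Standard Normal CDF via the error function (stdlib only).
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))

mu, sigma, n = 112.0, 20.6, 20
sem = sigma / math.sqrt(n)      # standard error of the mean, about 4.61 oz
z = (120.0 - mu) / sem          # Z score for X-bar = 120, about 1.74
p = 1.0 - normal_cdf(z)         # P(X-bar > 120)
print(round(sem, 2), round(z, 2), round(p, 3))
```

The mean of 20 babies exceeding 120 oz is far less likely than a single baby exceeding 120 oz, because the mean varies much less than individuals do.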
Law of Large Numbers
• Recall X̄ ~ Normal(μ, σ²/n).
• As n gets large, the distribution of X̄ is forced closer and closer to μ.
• With larger sample sizes, X̄ provides a better estimate of μ.
• The same is true for the sample standard deviation s.
• As the sample size increases, s should get closer to the population standard deviation σ.
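Both convergence claims on this slide can be seen in a small simulation. The population parameters below (μ = 7.4, σ = 5.0) are illustrative assumptions, not study values:

```python
import random
import statistics

random.seed(3)
mu, sigma = 7.4, 5.0  # hypothetical population mean and SD

# As n grows, the sample mean drifts toward mu and the sample SD toward sigma.
for n in (10, 100, 10_000):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = statistics.mean(sample)
    s = statistics.stdev(sample)
    print(n, round(xbar, 2), round(s, 2))
```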
Standard Error versus Standard Deviation
• Standard deviation: describes the variability of a population or a sample.
• Standard error: describes the variability of an estimator, which is usually a function of the whole sample.
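The contrast is easy to see numerically: as the sample grows, the sample SD settles near σ (it describes individuals), while the standard error keeps shrinking (it describes the estimator). The population values below are illustrative.

```python
import math
import random
import statistics

random.seed(4)

for n in (25, 2500):
    sample = [random.gauss(112, 20.6) for _ in range(n)]
    sd = statistics.stdev(sample)   # variability of individuals: stays near sigma
    se = sd / math.sqrt(n)          # variability of the estimator: shrinks with n
    print(n, round(sd, 1), round(se, 2))
```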
Confidence intervals for the mean

    (X̄ − μ) / (σ/√n) ~ N(0, 1)

• If n is large enough we can use this result to construct a confidence interval for μ.
• However, this would result in a formula that involves σ, a value that we don't usually know.
• In practice we will estimate σ with the sample standard deviation, s.
• Substituting the random variable s for σ will alter the distribution of the Z score slightly.
The t distribution
The distribution of the statistic

    T = (X̄ − μ) / (s/√n)

is called a "t" distribution with n − 1 "degrees of freedom", and is denoted by t_{n−1}.
The t distribution

    T = (X̄ − μ) / (s/√n)

• The shape of the t distribution is similar to the Normal distribution, but it has higher variability.
• How much higher depends on the degrees of freedom, which depends on the sample size.
The t distribution

    T = (X̄ − μ) / (s/√n)

• The larger the sample, the less variability.
• t distributions with higher degrees of freedom are more similar to the Normal distribution.
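The extra variability of the t distribution can be demonstrated by simulating the T statistic itself for a small sample (n = 5, so 4 degrees of freedom). If T were standard Normal, only about 5% of values would fall beyond ±1.96; with 4 degrees of freedom, noticeably more do.

```python
import math
import random
import statistics

random.seed(5)
mu, sigma, n = 0.0, 1.0, 5  # small sample -> t with 4 degrees of freedom

t_stats = []
for _ in range(20_000):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = statistics.mean(sample)
    s = statistics.stdev(sample)
    # Same statistic as on the slide, with s substituted for sigma.
    t_stats.append((xbar - mu) / (s / math.sqrt(n)))

# Fraction of simulated T statistics beyond the Normal's 95% cutoffs.
tail = sum(abs(t) > 1.96 for t in t_stats) / len(t_stats)
print(round(tail, 3))
```

The heavier tails come from the extra randomness of estimating σ with s from only 5 observations.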
Confidence intervals for the mean
• If X is Normal or n is large, then T = (X̄ − μ) / (s/√n) follows a t distribution with n − 1 degrees of freedom, and

    P(−t_{n−1, 0.975} < T < t_{n−1, 0.975}) = 0.95
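Putting the pieces together, a 95% t interval can be computed by hand as X̄ ± t_{n−1,0.975} · s/√n. The sample of 20 birthweights below is made up for illustration, and t_{19,0.975} ≈ 2.093 is taken from a t table.

```python
import math
import statistics

# Hypothetical sample of n = 20 birthweights (oz); illustration only.
sample = [118, 99, 126, 107, 112, 131, 104, 121, 95, 116,
          109, 124, 102, 138, 111, 98, 119, 127, 106, 114]

n = len(sample)
xbar = statistics.mean(sample)
s = statistics.stdev(sample)
t_crit = 2.093  # t_{19, 0.975}, from a t table

# 95% confidence interval: X-bar +/- t * s / sqrt(n)
half_width = t_crit * s / math.sqrt(n)
print(round(xbar - half_width, 1), round(xbar + half_width, 1))
```

With a larger sample the critical value would shrink toward the Normal's 1.96 and the interval would narrow, for both reasons discussed above.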