Confidence Intervals

Berlin Chen
Department of Computer Science & Information Engineering
National Taiwan Normal University

Reference: 1. W. Navidi, Statistics for Engineers and Scientists, Chapter 5 & teaching material

Introduction

• We have discussed point estimates:
  – $\hat{p}$ as an estimate of a success probability $p$ (Bernoulli trials)
  – $\bar{X}$ as an estimate of the population mean $\mu$

• These point estimates are almost never exactly equal to the true values they estimate
  – For the point estimate to be useful, it is necessary to describe just how far off from the true value it is likely to be
  – Recall that one way to estimate how far our estimate is from the true value is to report an estimate of the standard deviation, or uncertainty, of the point estimate

• In this chapter, we obtain more information about the estimation precision by computing a confidence interval whenever the estimate is normally distributed

Revisit: The Central Limit Theorem

• The Central Limit Theorem
  – Let $X_1, \ldots, X_n$ be a random sample from a population with mean $\mu$ and variance $\sigma^2$ ($n$ large enough)
  – Let $\bar{X} = \dfrac{X_1 + \cdots + X_n}{n}$ be the sample mean
  – Let $S_n = X_1 + \cdots + X_n$ be the sum of the sample observations. Then if $n$ is sufficiently large,
    • $\bar{X} \sim N(\mu, \sigma^2/n)$ approximately (the sample mean is approximately normal!)
    • $S_n \sim N(n\mu, n\sigma^2)$ approximately
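As a quick check of this claim, here is a minimal simulation sketch (not part of the original slides; it assumes NumPy is available). It draws many samples from a skewed exponential population and compares the spread of their sample means with the CLT prediction $\sigma/\sqrt{n}$.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 1.0, 1.0, 50, 10_000   # exponential(1): mean 1, std 1

# Draw `reps` samples of size n and compute each sample mean
sample_means = rng.exponential(scale=mu, size=(reps, n)).mean(axis=1)

print("mean of sample means:", sample_means.mean())   # close to mu
print("std  of sample means:", sample_means.std())    # close to sigma / sqrt(n)
print("CLT prediction      :", sigma / np.sqrt(n))
```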

Example

• Assume that a large number of independent, unbiased measurements, all made using the same procedure, are taken on the diameter of a piston. The sample mean $\bar{X}$ of the measurements is 14.0 cm (approximately normally distributed, by the Central Limit Theorem), and the uncertainty in this quantity, which is the standard deviation $\sigma_{\bar{X}}$ of the sample mean $\bar{X}$, is 0.1 cm

• So, we have a high level of confidence that the true diameter is in the interval (13.7, 14.3). This is because it is highly unlikely that the sample mean will differ from the true diameter by more than three standard deviations

[Figure: normal curve of $\bar{X}$ with the regions $\mu \pm 1\sigma_{\bar{X}}$ and $\mu \pm 1.96\sigma_{\bar{X}}$ marked]

Large-Sample Confidence Interval for a Population Mean

• Recall the previous example: since the population mean will not be exactly equal to the sample mean of 14, it is best to construct a confidence interval around 14 that is likely to cover the population mean
  – We can then quantify our level of confidence that the population mean is actually covered by the interval

• To see how to construct a confidence interval, let $\mu$ represent the unknown population mean and let $\sigma^2$ be the unknown population variance. Let $X_1, \ldots, X_{100}$ be the 100 diameters of the pistons. The observed value of $\bar{X}$ is the mean of a large sample, and the Central Limit Theorem specifies that it comes from a normal distribution with mean $\mu$ and standard deviation $\sigma_{\bar{X}} = \sigma/\sqrt{100}$

Illustration of Capturing the True Mean

• Here is a normal curve, which represents the distribution of $\bar{X}$. The middle 95% of the curve, extending a distance of $1.96\sigma_{\bar{X}}$ on either side of the population mean $\mu$, is indicated. The following illustrates what happens if $\bar{X}$ lies within the middle 95% of the distribution: 95% of the samples that could have been drawn fall into this category

[Figure: 95% confidence interval $\bar{X} \pm 1.96\sigma_{\bar{X}}$ covering the population mean $\mu$]

Illustration of Not Capturing the True Mean

• If the sample mean lies outside the middle 95% of the curve: only 5% of all the samples that could have been drawn fall into this category. For those more unusual samples, the 95% confidence interval $\bar{X} \pm 1.96\sigma_{\bar{X}}$ fails to cover the true population mean $\mu$

[Figure: 95% confidence interval $\bar{X} \pm 1.96\sigma_{\bar{X}}$ failing to cover the population mean $\mu$]

Computing a 95% Confidence Interval

• The 95% confidence interval (CI) is $\bar{X} \pm 1.96\,\sigma_{\bar{X}}$
• So, a 95% CI for the mean is $14 \pm 1.96(0.1)$. We can use the sample standard deviation as an estimate of the population standard deviation, since the sample size is large
• We can say that we are 95% confident, or confident at the 95% level, that the population mean diameter for pistons lies between 13.804 and 14.196
• Warning: the methods described here require that the data be a random sample from a population. When used for other samples, the results may not be meaningful
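A minimal sketch of this calculation in code, using only the summary statistics quoted on the slide (SciPy is assumed for the z critical value):

```python
from scipy.stats import norm

xbar, se = 14.0, 0.1          # sample mean and its standard deviation (uncertainty)
z = norm.ppf(1 - 0.05 / 2)    # ≈ 1.96 for a 95% interval

lower, upper = xbar - z * se, xbar + z * se
print(f"95% CI: ({lower:.3f}, {upper:.3f})")   # (13.804, 14.196)
```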

Question?

• Does this 95% confidence interval actually cover the population mean $\mu$?
• It depends on whether this particular sample happened to be one whose mean (i.e., sample mean) came from the middle 95% of the distribution, or whether it was a sample whose mean was unusually large or small, in the outer 5% of the distribution
• There is no way to know for sure into which category this particular sample falls
• In the long run, if we repeated these confidence intervals over and over, then 95% of the samples would have means in the middle 95% of the distribution, and so 95% of the confidence intervals would cover the population mean

Extension

• We are not always interested in computing 95% confidence intervals. Sometimes we would like to have a different level of confidence
  – We can use this reasoning to compute confidence intervals with various confidence levels

• Suppose we are interested in 68% confidence intervals. We know that the middle 68% of the normal distribution is in an interval that extends $1.0\,\sigma_{\bar{X}}$ on either side of the population mean $\mu$
  – It follows that an interval of the same length around $\bar{X}$ will cover the population mean for 68% of the samples that could possibly be drawn
  – For our example, a 68% CI for the diameter of the pistons is $14.0 \pm 1.0(0.1)$, or (13.9, 14.1)

100(1 − α)% CI

• Let $X_1, \ldots, X_n$ be a large ($n > 30$) random sample from a population with mean $\mu$ and standard deviation $\sigma$, so that $\bar{X}$ is approximately normal. Then a level $100(1-\alpha)\%$ confidence interval for $\mu$ is
  $$\bar{X} \pm z_{\alpha/2}\,\sigma_{\bar{X}}$$
  – $z_{\alpha/2}$ is the z-score that cuts off an area of $\alpha/2$ in the right-hand tail
  – where $\sigma_{\bar{X}} = \sigma/\sqrt{n}$. When the value of $\sigma$ is unknown, it can be replaced with the sample standard deviation $s$

Z-Table

[Figure: standard normal (z) table]

• E.g., for $\bar{X} \pm z_{\alpha/2}\,\sigma_{\bar{X}}$ with $\alpha = 0.05$, the table gives $z_{\alpha/2} = 1.96$

Particular CI’s • X

s n

• X  1.645

is a 68% interval for  s

• X  1.96 s

n

is a 90% interval for  is a 95% interval for 

n • X  2.58 s is a 99% interval for  n s • X 3 is a 99.7% interval for  n Note that even for large samples samples, the distribution of X is only approximately normal, rather than exactly normal. Therefore, the levels stated for confidence interval are approximate. Statistics-Berlin Chen 13

Example (CI Given a Level)

• Example 5.1: The sample mean and standard deviation for the fill weights of 100 boxes are $\bar{X} = 12.05$ and $s = 0.1$. Find an 85% confidence interval for the mean fill weight of the boxes.
  Answer: To find an 85% CI, set $1 - \alpha = 0.85$ to obtain $\alpha = 0.15$ and $\alpha/2 = 0.075$. We then look in the table for $z_{0.075}$, the z-score that cuts off 7.5% of the area in the right-hand tail. We find $z_{0.075} = 1.44$. We approximate $\sigma_{\bar{X}} \approx s/\sqrt{n} = 0.01$. So the 85% CI is $12.05 \pm (1.44)(0.01)$, or (12.0356, 12.0644).

Another Example (The Level of a CI)

• Question: A sample of 50 micro-drills had an average lifetime (expressed as the number of holes drilled before failure) of 12.68 with a standard deviation of 6.83. Suppose an engineer reported a confidence interval of (11.09, 14.27) but neglected to specify the level. What is the level of this confidence interval?
  Answer: The confidence interval has the form $\bar{X} \pm z_{\alpha/2}\, s/\sqrt{n}$. We will solve for $z_{\alpha/2}$ and then consult the z table to determine the value of $\alpha$. The upper confidence limit of 14.27 therefore satisfies the equation $14.27 = 12.68 + z_{\alpha/2}\,(6.83/\sqrt{50})$. Therefore, $z_{\alpha/2} = 1.646$. From the z table, we determine that $\alpha/2$, the area to the right of 1.646, is approximately 0.05. The level is $100(1-\alpha)\%$, or 90%.
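A sketch of the same computation in code (SciPy assumed; the numbers are those quoted on the slide):

```python
import math
from scipy.stats import norm

xbar, s, n = 12.68, 6.83, 50
upper = 14.27

z = (upper - xbar) / (s / math.sqrt(n))   # ≈ 1.646
alpha = 2 * norm.sf(z)                    # two tails, each of area alpha/2
print(f"z = {z:.3f}, level ≈ {100 * (1 - alpha):.0f}%")   # ≈ 90%
```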

More About CI’s (1/2) ( ) • The confidence level of an interval measures the reliability of the method used to compute the interval • A level 100(1 ( - )% ) confidence interval is one computed p by a method that in the long run will succeed in in covering the population mean a proportion 1 -  of all the times that it is i used d • In practice, there is a decision about what level of confidence to use • This decision involves a trade-off, trade off, because intervals with greater confidence are less precise

Statistics-Berlin Chen 16

More About CI’s (2/2) ( )

100 samples

68% confidence inter intervals als 95% confidence fid iintervals t l 99.7% 99 7% confidence fid iintervals t l

Statistics-Berlin Chen 17
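The behavior in the figure can be reproduced by simulation; here is a minimal sketch (NumPy assumed) that estimates how often a 95% large-sample interval covers the true mean.

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, n, reps = 10.0, 2.0, 100, 10_000

covered = 0
for _ in range(reps):
    x = rng.normal(mu, sigma, size=n)
    xbar, se = x.mean(), x.std(ddof=1) / np.sqrt(n)
    lo, hi = xbar - 1.96 * se, xbar + 1.96 * se
    covered += (lo <= mu <= hi)

print("empirical coverage:", covered / reps)   # close to 0.95
```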

Probability vs. Confidence

• In computing a CI, such as the one for the diameter of the pistons, (13.804, 14.196), it is tempting to say that the probability that $\mu$ lies in this interval is 95%
• The term probability refers to random events, which can come out differently when experiments are repeated
• 13.804 and 14.196 are fixed, not random. The population mean is also fixed. The mean diameter is either in the interval or not
  – There is no randomness involved

• So, we say that we have 95% confidence that the population mean is in this interval
  – It is correct to say that a method for computing a 95% confidence interval has probability 95% of covering the population mean

Determining Sample Size

• Back to the example of the diameter of the pistons: we had a CI of (13.804, 14.196)
  – This interval specifies the mean to within ±0.196. Now assume that the interval is too wide to be useful

• Question: Assume that it is desirable to produce a 95% confidence interval that specifies the mean to within ±0.1
  – To do this, the sample size must be increased. The half-width of the CI is given by $\pm z_{\alpha/2}\,\sigma/\sqrt{n}$. If we know $\sigma$ and the width is specified, then we can find the $n$ needed to get the desired width
  – For our example, $z_{\alpha/2} = 1.96$ and the estimated standard deviation of the population is 1. So $0.1 = 1.96(1)/\sqrt{n}$, and the $n$ that accomplishes this is 385 (always round up)
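A sketch of this sample-size calculation (SciPy assumed for the critical value; σ = 1 and half-width 0.1 are the slide's numbers):

```python
import math
from scipy.stats import norm

sigma, half_width, level = 1.0, 0.1, 0.95
z = norm.ppf(1 - (1 - level) / 2)             # ≈ 1.96

n = math.ceil((z * sigma / half_width) ** 2)  # always round up
print("required sample size:", n)             # 385
```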

One-Sided Confidence Intervals (1/2)

• We are not always interested in CI's with both an upper and a lower bound
• For example, we may want a confidence interval on battery life. We are only interested in a lower bound on the battery life; there is no upper bound on how long a battery can last (the confidence interval is (lower bound, ∞))
• With the same conditions as for the two-sided CI, the level $100(1-\alpha)\%$ lower confidence bound for $\mu$ is
  $$\bar{X} - z_{\alpha}\,\sigma_{\bar{X}},$$
  and the level $100(1-\alpha)\%$ upper confidence bound for $\mu$ is
  $$\bar{X} + z_{\alpha}\,\sigma_{\bar{X}}.$$

One-Sided Confidence Intervals (2/2)

• Example: a one-sided confidence interval (lower bound): $\left(\bar{X} - 1.645\,\sigma_{\bar{X}},\ \infty\right)$ is a 95% lower confidence interval for $\mu$

[Figure: normal curve with the one-sided 95% lower confidence bound marked]

Confidence Intervals for Proportions

• The method discussed in the last section (Sec. 5.1) was for the mean of any population from which a large sample is drawn
• When the population has a Bernoulli distribution, this expression takes on a special form (the mean is equal to the success probability)
  – If we denote the success probability by $p$, its estimate $\hat{p}$ can be expressed as
    $$\hat{p} = \frac{X}{n}$$
    where $n$ is the sample size and $X = Y_1 + Y_2 + \cdots + Y_n$ is the number of sample items $Y_i$ that are successes
  – A 95% confidence interval (CI) for $p$ is
    $$\hat{p} - 1.96\sqrt{\frac{p(1-p)}{n}} \;<\; p \;<\; \hat{p} + 1.96\sqrt{\frac{p(1-p)}{n}}.$$

Comments

• The limits of the confidence interval contain the unknown population proportion $p$
  – We have to somehow estimate it (e.g., using $\hat{p}$)

• Recent research shows that a slight modification of $n$ and of the estimate of $p$ improves the interval
  – Define
    $$\tilde{n} = n + 4$$
    and
    $$\tilde{p} = \frac{X + 2}{\tilde{n}}$$

CI for p

• Let $X$ be the number of successes in $n$ independent Bernoulli trials with success probability $p$, so that $X \sim \mathrm{Bin}(n, p)$
• Then a $100(1-\alpha)\%$ confidence interval for $p$ is
  $$\tilde{p} \pm z_{\alpha/2}\sqrt{\frac{\tilde{p}(1-\tilde{p})}{\tilde{n}}}.$$
  – If the lower limit is less than 0, replace it with 0
  – If the upper limit is greater than 1, replace it with 1
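A sketch of this adjusted interval in code (SciPy assumed; the counts in the usage line are hypothetical, chosen only for illustration):

```python
from math import sqrt
from scipy.stats import norm

def proportion_ci(x, n, level=0.95):
    """Adjusted CI for a Bernoulli success probability, as on the slide."""
    z = norm.ppf(1 - (1 - level) / 2)
    n_t = n + 4                      # n-tilde = n + 4
    p_t = (x + 2) / n_t              # p-tilde = (X + 2) / n-tilde
    half = z * sqrt(p_t * (1 - p_t) / n_t)
    return max(0.0, p_t - half), min(1.0, p_t + half)   # clip to [0, 1]

print(proportion_ci(x=14, n=50))     # hypothetical counts
```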

Determining the Sample Size for a Specific CI for p

• Sometimes we wish to compute a necessary sample size without having a reliable estimate $\tilde{p}$ available
  – The quantities $\tilde{p}(1-\tilde{p})$ and $\tilde{n}$ determine the width of the confidence interval
  – The quantity $\tilde{p}(1-\tilde{p})$ is greatest (= 0.25) when $\tilde{p} = 0.5$

• Example 5.14: How large a sample is needed to guarantee that the width of the 95% confidence interval for $p$ is no larger than ±0.08?
  – The CI has half-width $1.96\sqrt{\tilde{p}(1-\tilde{p})/\tilde{n}}$. Using the conservative value $\tilde{p} = 0.5$:
    $$1.96\sqrt{\frac{0.5(1-0.5)}{n+4}} \le 0.08 \;\Rightarrow\; n \ge 147,$$
    so $n = 147$ is a conservative sample size

Small-Sample CI for a Population Mean

• The methods that we have discussed for a population mean previously require that the sample size be large
• When the sample size is small, there are no general methods for finding CI's
• If the population is approximately normal, a probability distribution called the Student's t distribution can be used to compute confidence intervals for a population mean

• The two standardized quantities involved are
  $$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \qquad\text{and}\qquad t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$

More on CI’s • What can we do if X is the mean of a small sample? p • If the sample size is small, s may not be close to , and normal. If we know nothing X may not be approximately normal about the population from which the small sample was drawn,, there are no easyy methods for computing p g CI’s • However, if the population is approximately normal, X will be approximately normal even when the sample size n is small. It turns out that we can use the quantity ( X   ) /(( s / n ) , but since s may y not be close to , this quantity instead has a Student’s t distribution with n-1 degrees of freedom, which we denote tn 1 Statistics-Berlin Chen 27

Student’s t Distribution (1/2) ( ) • Let X1,,…,X , n be a small ((n < 30)) random sample p from a normal population with mean  . Then the quantity (X  )

.

s/ n has a Student’s t distribution with n -1 degrees of freedom (denoted by tn-1).

• When n is large, the distribution of the above quantity is very close to normal, so the normal curve can be used, rather th than th the th Student’s St d t’ t

Cf. http://en.wikipedia.org/wiki/Student's_t‐distribution

Statistics-Berlin Chen 28

Student’s t Distribution (2/2) ( ) • Plots of p probability y density y function of student’s t curve for various of degrees

– The normal curve with mean 0 and variance 1 (z curve) is plotted for comparison – The t curves are more spread out than the normal, but the amountt off extra t spread d outt decreases d as the th number b off degrees d of freedom increases Statistics-Berlin Chen 29

More on Student’s t • Table A.3 called a t table, provides probabilities associated with the Student’s Student s t distribution

Statistics-Berlin Chen 30

Examples

• Question 1: A random sample of size 10 is to be drawn from a normal distribution with mean 4. The Student's t statistic $t = (\bar{X} - 4)/(s/\sqrt{10})$ is to be computed. What is the probability that $t > 1.833$?
  – Answer: This t statistic has $10 - 1 = 9$ degrees of freedom. From the t table, $P(t > 1.833) = 0.05$

• Question 2: Find the value for the $t_{14}$ distribution whose lower-tail probability is 0.01
  – Answer: Look down the column headed "0.01" to the row corresponding to 14 degrees of freedom. The value is $t = 2.624$. This value cuts off an area, or probability, of 1% in the upper tail. The value whose lower-tail probability is 1% is $-2.624$
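The same lookups can be done without the printed table; a short sketch assuming SciPy:

```python
from scipy.stats import t

# Question 1: P(T > 1.833) for 9 degrees of freedom
print(t.sf(1.833, df=9))        # ≈ 0.05

# Question 2: value of the t14 distribution with lower-tail probability 0.01
print(t.ppf(0.01, df=14))       # ≈ -2.624
```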

Student’s t CI • Let X1,…,Xn be a small random sample from a normal population with mean . Then a level 100(1 - )% CI for  is

X  t n1, / 2

s n

. T Two-sided id d CI

• To be able to use the Student’s t distribution for calculation and confidence intervals, you must have a sample that comes from a population that it approximately normal

Statistics-Berlin Chen 32
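A minimal sketch of this interval (NumPy and SciPy assumed; the data array is hypothetical):

```python
import numpy as np
from scipy.stats import t

def t_interval(x, level=0.95):
    x = np.asarray(x, dtype=float)
    n = x.size
    xbar, s = x.mean(), x.std(ddof=1)
    tcrit = t.ppf(1 - (1 - level) / 2, df=n - 1)
    half = tcrit * s / np.sqrt(n)
    return xbar - half, xbar + half

sample = [5.1, 4.9, 5.6, 5.2, 4.7, 5.0, 5.3]   # hypothetical small sample
print(t_interval(sample))
```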

Other Student’s t CI’s • Let X1,…,Xn be a small random sample from a normal population with mean  – Then a level 100(1 - )% upper confidence bound for  i

X  t n1,

s n

.

one-sided CI

– Then a level 100(1 ( - )% ) lower confidence bound for  is

X  t n1,

s n

.

one-sided CI

• Occasionally a small sample may be taken from a normal population whose standard deviation  is known. In these cases, we do not use the Student’s t curve, because we are not approximating  with s. The CI to use here here, is the one using the z table, table that we discussed in the first section X   X  X   X

/ n

Statistics-Berlin Chen 33

Determining the Appropriateness of Using the t Distribution (1/2)

• We have to decide whether a population is approximately normal before using the t distribution to calculate a CI
  – A reasonable way is to construct a boxplot or dotplot of the sample
  – If these plots do not reveal strong asymmetry or any outliers, then in most cases the Student's t distribution will be reliable

• Example 5.9: Is it appropriate to use the t distribution to calculate the CI for a population mean, given the random sample of 15 items shown below?
  580, 400, 428, 825, 850, 875, 920, 550, 575, 750, 636, 360, 590, 735, 950
  – Yes! (a dotplot of these values shows no strong asymmetry and no outliers)

Determining the Appropriateness of Using the t Distribution (2/2)

• Example 5.20: Is it appropriate to use the t distribution to calculate the CI for a population mean, given the random sample of 11 items shown below?
  38.43, 38.43, 38.39, 38.83, 38.45, 38.35, 38.43, 38.31, 38.32, 38.38, 38.50
  – No! (the sample contains an outlier, 38.83, so the t distribution should not be used)

CI for the Difference in Two Means (1/2)

• We can also estimate the difference between the means $\mu_X$ and $\mu_Y$ of two populations $X$ and $Y$
  – We draw two independent random samples, one from $X$ and one from $Y$, with sample means $\bar{X}$ and $\bar{Y}$, respectively
  – We then construct the CI for $\mu_X - \mu_Y$ by determining the distribution of $\bar{X} - \bar{Y}$

• Recall the probability theorem: let $X$ and $Y$ be independent, with $X \sim N(\mu_X, \sigma_X^2)$ and $Y \sim N(\mu_Y, \sigma_Y^2)$. Then
  $$X + Y \sim N\!\left(\mu_X + \mu_Y,\ \sigma_X^2 + \sigma_Y^2\right)$$
  and
  $$X - Y \sim N\!\left(\mu_X - \mu_Y,\ \sigma_X^2 + \sigma_Y^2\right)$$

CI for the Difference in Two Means (2/2)

• Let $X_1, \ldots, X_{n_X}$ be a large random sample of size $n_X$ from a population with mean $\mu_X$ and standard deviation $\sigma_X$, and let $Y_1, \ldots, Y_{n_Y}$ be a large random sample of size $n_Y$ from a population with mean $\mu_Y$ and standard deviation $\sigma_Y$. If the two samples are independent, then a level $100(1-\alpha)\%$ CI for $\mu_X - \mu_Y$ is
  $$\bar{X} - \bar{Y} \pm z_{\alpha/2}\sqrt{\frac{\sigma_X^2}{n_X} + \frac{\sigma_Y^2}{n_Y}}. \qquad\text{(two-sided CI)}$$
  – When the values of $\sigma_X$ and $\sigma_Y$ are unknown, they can be replaced with the sample standard deviations $s_X$ and $s_Y$
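A sketch of the large-sample two-mean interval (SciPy assumed; the summary numbers are hypothetical):

```python
import math
from scipy.stats import norm

def two_mean_ci(xbar, sx, nx, ybar, sy, ny, level=0.95):
    z = norm.ppf(1 - (1 - level) / 2)
    se = math.sqrt(sx**2 / nx + sy**2 / ny)   # standard error of the difference
    diff = xbar - ybar
    return diff - z * se, diff + z * se

# hypothetical summary statistics for two independent large samples
print(two_mean_ci(xbar=42.1, sx=3.2, nx=120, ybar=40.3, sy=2.9, ny=150))
```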

CI for the Difference Between Two Proportions (1/3)

• Recall that in a Bernoulli population, the mean is equal to the success probability (population proportion) $p$
• Let $X$ be the number of successes in $n_X$ independent Bernoulli trials with success probability $p_X$, and let $Y$ be the number of successes in $n_Y$ independent Bernoulli trials with success probability $p_Y$, so that $X \sim \mathrm{Bin}(n_X, p_X)$ and $Y \sim \mathrm{Bin}(n_Y, p_Y)$
  – The sample proportions are $\hat{p}_X = X/n_X$ and $\hat{p}_Y = Y/n_Y$. By the Central Limit Theorem (when $n_X$ and $n_Y$ are large),
    $$\hat{p}_X \sim N\!\left(p_X,\ \frac{p_X(1-p_X)}{n_X}\right), \qquad \hat{p}_Y \sim N\!\left(p_Y,\ \frac{p_Y(1-p_Y)}{n_Y}\right)$$
    and therefore
    $$\hat{p}_X - \hat{p}_Y \sim N\!\left(p_X - p_Y,\ \frac{p_X(1-p_X)}{n_X} + \frac{p_Y(1-p_Y)}{n_Y}\right)$$

CI for the Difference Between Two Proportions (2/3)

• The difference satisfies the following inequality for 95% of all possible samples:
  $$\hat{p}_X - \hat{p}_Y - 1.96\sqrt{\frac{p_X(1-p_X)}{n_X} + \frac{p_Y(1-p_Y)}{n_Y}} \;<\; p_X - p_Y \;<\; \hat{p}_X - \hat{p}_Y + 1.96\sqrt{\frac{p_X(1-p_X)}{n_X} + \frac{p_Y(1-p_Y)}{n_Y}} \qquad\text{(two-sided CI)}$$
  – Traditionally in the above inequality, $p_X$ is replaced by $\hat{p}_X$ and $p_Y$ is replaced by $\hat{p}_Y$

CI for the Difference Between Two Proportions (3/3)

• Adjustment (in implementation):
  – Define
    $$\tilde{n}_X = n_X + 2, \quad \tilde{n}_Y = n_Y + 2, \quad \tilde{p}_X = \frac{X+1}{\tilde{n}_X}, \quad \tilde{p}_Y = \frac{Y+1}{\tilde{n}_Y}$$
  – The $100(1-\alpha)\%$ CI for the difference $p_X - p_Y$ is
    $$\tilde{p}_X - \tilde{p}_Y \pm z_{\alpha/2}\sqrt{\frac{\tilde{p}_X(1-\tilde{p}_X)}{\tilde{n}_X} + \frac{\tilde{p}_Y(1-\tilde{p}_Y)}{\tilde{n}_Y}}.$$
  • If the lower limit of the confidence interval is less than −1, replace it with −1
  • If the upper limit of the confidence interval is greater than 1, replace it with 1
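A sketch of this adjusted two-proportion interval (SciPy assumed; the counts in the usage line are hypothetical):

```python
from math import sqrt
from scipy.stats import norm

def two_prop_ci(x, nx, y, ny, level=0.95):
    z = norm.ppf(1 - (1 - level) / 2)
    nx_t, ny_t = nx + 2, ny + 2                # adjusted sample sizes
    px_t, py_t = (x + 1) / nx_t, (y + 1) / ny_t
    se = sqrt(px_t * (1 - px_t) / nx_t + py_t * (1 - py_t) / ny_t)
    diff = px_t - py_t
    return max(-1.0, diff - z * se), min(1.0, diff + z * se)   # clip to [-1, 1]

print(two_prop_ci(x=45, nx=60, y=38, ny=70))   # hypothetical counts
```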

Small-Sample CI for the Difference Between Two Means (1/2)

• Let $X_1, \ldots, X_{n_X}$ be a random sample of size $n_X$ from a normal population with mean $\mu_X$ and standard deviation $\sigma_X$, and let $Y_1, \ldots, Y_{n_Y}$ be a random sample of size $n_Y$ from a normal population with mean $\mu_Y$ and standard deviation $\sigma_Y$. Assume that the two samples are independent. If the populations do not necessarily have the same variance, a level $100(1-\alpha)\%$ CI for $\mu_X - \mu_Y$ is
  $$\bar{X} - \bar{Y} \pm t_{\nu,\,\alpha/2}\sqrt{\frac{s_X^2}{n_X} + \frac{s_Y^2}{n_Y}}. \qquad\text{(two-sided CI)}$$
  – The number of degrees of freedom, $\nu$ (pronounced "nu"), is given by (rounded down to the nearest integer)
  $$\nu = \frac{\left(\dfrac{s_X^2}{n_X} + \dfrac{s_Y^2}{n_Y}\right)^2}{\dfrac{\left(s_X^2/n_X\right)^2}{n_X - 1} + \dfrac{\left(s_Y^2/n_Y\right)^2}{n_Y - 1}}$$
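A sketch of this interval with the degrees-of-freedom formula above (NumPy and SciPy assumed; the two data arrays are hypothetical):

```python
import numpy as np
from scipy.stats import t

def unequal_variance_ci(x, y, level=0.95):
    x, y = np.asarray(x, float), np.asarray(y, float)
    nx, ny = x.size, y.size
    vx, vy = x.var(ddof=1) / nx, y.var(ddof=1) / ny
    # degrees of freedom nu, rounded down to the nearest integer
    nu = int((vx + vy) ** 2 / (vx**2 / (nx - 1) + vy**2 / (ny - 1)))
    tcrit = t.ppf(1 - (1 - level) / 2, df=nu)
    diff, se = x.mean() - y.mean(), np.sqrt(vx + vy)
    return diff - tcrit * se, diff + tcrit * se

xs = [10.2, 9.8, 11.1, 10.5, 9.9]         # hypothetical small samples
ys = [9.1, 9.6, 8.8, 9.4, 9.0, 9.3]
print(unequal_variance_ci(xs, ys))
```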

Small-Sample CI for the Difference Between Two Means (2/2)

• If we further know that the populations $X$ and $Y$ have nearly the same variance, then a $100(1-\alpha)\%$ CI for $\mu_X - \mu_Y$ is
  $$\bar{X} - \bar{Y} \pm t_{n_X + n_Y - 2,\,\alpha/2}\; s_p \sqrt{\frac{1}{n_X} + \frac{1}{n_Y}}. \qquad\text{(two-sided CI)}$$
• Degrees of freedom: $n_X + n_Y - 2$
  – The quantity $s_p^2$ is the pooled variance, used to approximate the common variance, and given by
  $$s_p^2 = \frac{(n_X - 1)s_X^2 + (n_Y - 1)s_Y^2}{n_X + n_Y - 2}.$$
• Don't assume the population variances are equal just because the sample variances are close!

CI for Paired Data (1/3)

• The methods discussed previously for finding CI's on the basis of two samples have required that the samples be independent
• However, in some cases it is better to design an experiment so that each item in one sample is paired with an item in the other (the variability between the cars then disappears)
  – Example: tread wear of tires made of two different materials, with both materials mounted on each car. With independent samples, the comparison would include both the variability between cars and the variability in wear between tires

CI for Paired Data (2/3)

• Let $(X_1, Y_1), \ldots, (X_n, Y_n)$ be sample pairs. Let $D_i = X_i - Y_i$. Let $\mu_X$ and $\mu_Y$ represent the population means for $X$ and $Y$, respectively. We wish to find a CI for the difference $\mu_X - \mu_Y$. Let $\mu_D$ represent the population mean of the differences; then $\mu_D = \mu_X - \mu_Y$. It follows that a CI for $\mu_D$ will also be a CI for $\mu_X - \mu_Y$

• Now, the sample $D_1, \ldots, D_n$ is a random sample from a population with mean $\mu_D$, so we can use one-sample methods to find CI's for $\mu_D$

CI for Paired Data (3/3)

• Let $D_1, \ldots, D_n$ be a small random sample ($n < 30$) of differences of pairs. If the population of differences is approximately normal, then a level $100(1-\alpha)\%$ CI for $\mu_D$ is
  $$\bar{D} \pm t_{n-1,\,\alpha/2}\,\frac{s_D}{\sqrt{n}}.$$

• If the sample size is large, a level $100(1-\alpha)\%$ CI for $\mu_D$ is
  $$\bar{D} \pm z_{\alpha/2}\,\sigma_{\bar{D}}.$$
  – In practice, $\sigma_{\bar{D}}$ is approximated with $s_D/\sqrt{n}$
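A sketch of the small-sample paired interval (NumPy and SciPy assumed; the paired measurements below are hypothetical):

```python
import numpy as np
from scipy.stats import t

def paired_ci(x, y, level=0.95):
    d = np.asarray(x, float) - np.asarray(y, float)   # differences D_i = X_i - Y_i
    n = d.size
    dbar, sd = d.mean(), d.std(ddof=1)
    tcrit = t.ppf(1 - (1 - level) / 2, df=n - 1)
    half = tcrit * sd / np.sqrt(n)
    return dbar - half, dbar + half

material_a = [4.1, 3.9, 4.4, 4.0, 4.2, 3.8]   # hypothetical tread wear, same cars
material_b = [3.8, 3.6, 4.1, 3.9, 3.9, 3.7]
print(paired_ci(material_a, material_b))
```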

Summary

• We learned about large- and small-sample CI's for means
• We also looked at CI's for proportions
• We discussed large- and small-sample CI's for differences in means
• We explored CI's for differences in proportions
