
Statistical Tests

Example: Suppose that of 100 applicants for a job, 50 were women and 50 were men, all equally qualified. Further suppose that the company hired 2 women and 8 men.

Question:
◦ Does the company discriminate against female job applicants?
◦ How likely is this outcome under the assumption that the company does not discriminate?
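The second question can be answered with a small calculation: if the 10 hires were drawn at random from the 100 equally qualified applicants (no discrimination), the number of women among them follows a hypergeometric distribution. A minimal sketch in Python; the function name is ours:

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(X = k): k successes when drawing n without replacement
    from N items, K of which are successes."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# 100 applicants, 50 of them women; 10 hires, only 2 of them women.
# Under "no discrimination" the 10 hires are a random sample of the 100.
p_at_most_2_women = sum(hypergeom_pmf(k, N=100, K=50, n=10) for k in range(3))
print(f"P(at most 2 women among 10 hires) = {p_at_most_2_women:.4f}")
```

Under the no-discrimination model, hiring at most 2 women out of 10 has probability of about 0.046, which is the tail probability the question asks for.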

Example:
◦ Study the success of a new elaborate safety program
◦ Record average weekly losses in hours of labor due to accidents before and after installation of the program in 10 industrial plants

Plant    1   2   3    4   5   6   7   8   9  10
Before  45  73  46  124  33  57  83  34  26  17
After   36  60  44  119  35  51  77  29  24  11

Question:
◦ Does the safety program have an effect on the loss of labour due to accidents?
◦ In 9 out of 10 plants the average weekly losses have decreased after implementation of the safety program. How likely is this (or a more extreme) outcome under the assumption that there is no difference before and after implementation of the safety program?
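The count of improved plants can be assessed with a sign test: if the program had no effect, each plant is equally likely to show a decrease or an increase, so the number of decreases is Bin(10, 1/2). A sketch of the computation; variable names are ours:

```python
from math import comb

before = [45, 73, 46, 124, 33, 57, 83, 34, 26, 17]
after  = [36, 60, 44, 119, 35, 51, 77, 29, 24, 11]

# Count the plants whose average weekly losses decreased.
decreases = sum(b > a for b, a in zip(before, after))

# Under "no effect", each plant decreases with probability 1/2,
# so the number of decreases is Bin(10, 1/2); sum the upper tail.
p_value = sum(comb(10, k) for k in range(decreases, 11)) / 2**10
print(decreases, p_value)
```

Here P(X ≥ 9) = 11/1024 ≈ 0.011, so an outcome this extreme is quite unlikely if the program has no effect.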

Testing Hypotheses I, Feb 16, 2004

-1-

Statistical Tests

Example: Fair coin. Suppose we have a coin. We suspect it might be unfair. We devise a statistical experiment:
◦ Toss the coin 100 times
◦ Conclude that the coin is fair if we see between 40 and 60 heads
◦ Otherwise decide that the coin is not fair

Let θ be the probability that the coin lands heads, that is,

P(Xi = 1) = θ

and

P(Xi = 0) = 1 − θ.

Our suspicion (“coin not fair”) is a hypothesis about the population parameter θ (θ ≠ 1/2) and thus about P. We emphasize this dependence of P on θ by writing Pθ.

Decision problem:
Null hypothesis H0: X ∼ Bin(100, 1/2)
Alternative hypothesis Ha: X ∼ Bin(100, θ), θ ≠ 1/2

The null hypothesis represents the default belief (here: the coin is fair). The alternative is the hypothesis we accept in view of evidence against the null hypothesis.

The data-based decision rule
  reject H0 if X ∉ [40, 60]
  do not reject H0 if X ∈ [40, 60]
is called a statistical test for the test problem H0 vs. Ha.


Statistical Tests

Example: Fair coin (contd)

Note: It is possible to obtain e.g. X = 55 (or X = 65)
◦ with probability 0.048 (resp. 0.0009) if θ = 0.5
◦ with probability 0.048 (resp. 0.049) if θ = 0.6
◦ with probability 0.0005 (resp. 0.047) if θ = 0.7

[Figure: pmfs p(x) of Bin(100, 0.5), Bin(100, 0.6) and Bin(100, 0.7) for x from 20 to 80, each marking the acceptance region [40, 60] ("accept H0") and its complement ("reject H0: θ ≠ 0.5").]


Types of errors

Example: Fair coin (contd)

It is possible that the test (decision rule) gives a wrong answer:
◦ If θ = 0.7 and x = 55, we do not reject the null hypothesis that the coin is fair although the coin in fact is not fair.
◦ If θ = 0.5 and x = 65, we reject the null hypothesis that the coin is fair although the coin in fact is fair.

The following table lists the possibilities:

Decision      H0 true             H0 false
Reject H0     type I error        correct decision
Accept H0     correct decision    type II error

Definition (Types of error) ◦ If we reject H0 when in fact H0 is true, this is a Type I error. ◦ If we do not reject H0 when in fact H0 is false, this is a Type II error.


Types of errors

Question: How good is our decision rule? For a good decision rule, the probability of committing an error of either type should be small.

Probability of type I error: α

If the null hypothesis is true, i.e. θ = 1/2, then

Pθ(reject H0) = Pθ(X ∉ [40, 60]) = 1 − Pθ(X ∈ [40, 60])
             = 1 − Σ_{x=40}^{60} (100 choose x) (1/2)^100
             = 0.035.
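The value 0.035 can be reproduced exactly from the binomial pmf; a sketch:

```python
from math import comb

# alpha = P(X outside [40, 60]) when X ~ Bin(100, 1/2):
# every outcome x has probability C(100, x) / 2^100.
alpha = 1 - sum(comb(100, x) for x in range(40, 61)) / 2**100
print(f"alpha = {alpha:.4f}")
```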

Thus the probability of a type I error, denoted as α, is 3.5%.

Probability of type II error: β(θ)

If the null hypothesis is false and the true probability of observing “head” is θ with θ ≠ 1/2, then

Pθ(accept H0) = Pθ(X ∈ [40, 60]) = Σ_{x=40}^{60} (100 choose x) θ^x (1 − θ)^(100−x).

Thus, the probability of an error of type II depends on θ. It will be denoted as β(θ).
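β(θ) can be evaluated for concrete alternatives by summing the binomial pmf over the acceptance region; a sketch, with the helper name beta being ours:

```python
from math import comb

def beta(theta, n=100, lo=40, hi=60):
    """Type II error probability beta(theta): P_theta(X in [lo, hi])
    for X ~ Bin(n, theta), defined for theta != 1/2."""
    return sum(comb(n, x) * theta**x * (1 - theta)**(n - x)
               for x in range(lo, hi + 1))

for theta in (0.55, 0.6, 0.7):
    print(f"beta({theta}) = {beta(theta):.4f}")
```

The farther θ lies from 1/2, the smaller β(θ) becomes: alternatives far from the null are easier to detect.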


Power of Tests

Question: How good is our test in detecting the alternative? Consider the probability of rejecting H0:

Pθ(reject H0) = Pθ(X ∉ [40, 60]) = 1 − Pθ(accept H0) = 1 − β(θ).

Note:
◦ If θ = 1/2, this is the probability of committing an error of type I: 1 − β(1/2) = α.
◦ If θ > 1/2, this is the probability of correctly rejecting H0.

Definition (Power of a test): We call 1 − β(θ) the power of the test, as it measures the ability to detect that the null hypothesis is false.

[Figure: power function 1 − β(θ) of the test "reject if X ∉ [40, 60]", plotted against θ ∈ [0, 1]; the curve attains its minimum α at θ = 0.5 and rises toward 1 as θ moves away from 0.5.]

Significance Tests

Idea: minimize the probability of committing an error of type I and of type II.

Different probabilities of type I error:

[Figure: power functions 1 − β(θ) of three tests, "reject if X ∉ [40, 60]", "reject if X ∉ [38, 62]" and "reject if X ∉ [42, 58]", plotted against θ ∈ [0, 1].]

Note: If we decrease the probability of a type I error,
◦ the power of the test, 1 − β(θ), decreases as well, and
◦ the probability of a type II error increases.

Problem: We cannot minimize both error probabilities simultaneously.

Solution:
◦ Choose a fixed level α for the probability of a type I error.
◦ Under this restriction, find the test with small probability of a type II error.

Remark:
◦ You do not have to do this minimization yourself.
◦ All tests taught in this course are of this kind.

Definition: A test of this kind is called a significance test with significance level α.
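The tradeoff can be checked numerically: widening the acceptance region lowers α but also lowers the power. A sketch comparing the three rejection rules, evaluating the power against the alternative θ = 0.6 (the helper name is ours):

```python
from math import comb

def binom_prob(n, theta, lo, hi):
    """P(lo <= X <= hi) for X ~ Bin(n, theta)."""
    return sum(comb(n, x) * theta**x * (1 - theta)**(n - x)
               for x in range(lo, hi + 1))

results = {}
for lo, hi in [(38, 62), (40, 60), (42, 58)]:
    alpha = 1 - binom_prob(100, 0.5, lo, hi)   # type I error probability
    pow06 = 1 - binom_prob(100, 0.6, lo, hi)   # power against theta = 0.6
    results[(lo, hi)] = (alpha, pow06)
    print(f"reject if X not in [{lo},{hi}]: "
          f"alpha = {alpha:.4f}, power(0.6) = {pow06:.4f}")
```

The output shows α and power moving together: the widest acceptance region [38, 62] has the smallest α and the smallest power, the narrowest region [42, 58] the largest of both.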

