Stat 20: Intro to Probability and Statistics

Stat 20: Intro to Probability and Statistics Lecture 21: Intro to Hypothesis Testing

Tessa L. Childers-Day UC Berkeley

30 July 2014

Recap

Natural Questions

Hypothesis Testing

Example

Recap: From Samples to Boxes

Spent the past 3 days reasoning from a sample to a box:
- Composition of box unknown
- Took SRS from box
- Computed sample statistic
- Used sample to estimate accuracy (via SE)
- Made a CI for the population parameter: “Based on the data collected, the reasonable range of values for the population parameter is from ____ to ____.”

What if we want to answer the question, “Is ____ a reasonable value for the population parameter, based on the data collected?”

By the end of this lecture...

You will be able to discuss the framework surrounding hypothesis tests:
- Understand the differences between null and alternative hypotheses
- Collect evidence
- Explain the role of a test statistic
- Define a p-value, and use it to decide whether or not to reject the null hypothesis

Coin Flipping

Say I flip a coin 200 times and:
- I observe 103 heads. Do you think the coin is fair?
- I observe 175 heads. Do you think the coin is fair?
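
One way to make the two coin scenarios concrete is to measure how far each count sits from what a fair coin would give, in standard-error units. The sketch below assumes the usual box model for a fair coin (a box holding one 0 and one 1, with 200 draws); the function name `heads_in_se_units` is illustrative and not from the lecture.

```python
import math

def heads_in_se_units(observed_heads, flips, p_heads=0.5):
    """Distance between observed and expected heads, in SEs of the count."""
    expected = flips * p_heads
    # SE of a count of 0/1 draws: sqrt(number of draws) * SD of the box
    se = math.sqrt(flips) * math.sqrt(p_heads * (1 - p_heads))
    return (observed_heads - expected) / se

for heads in (103, 175):
    print(heads, "heads:", round(heads_in_se_units(heads, 200), 2), "SEs from expected")
```

Under these assumptions, 103 heads is less than half an SE above the expected 100, while 175 heads is more than 10 SEs above it.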

Rolling a Die

Say I roll a die 1000 times and find:

- the distribution of dots as below. Do you think the die is fair?

  Dots          1     2     3     4     5     6
  % Observed    17%   15%   27%   16%   2%    23%

- the distribution of dots as below. Do you think the die is fair?

  Dots          1     2     3     4     5     6
  % Observed    16%   17%   17%   17%   17%   16%
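
To put numbers on the two die tables, the sketch below compares each observed percentage with the 1 in 6 expected from a fair die, in SE units. It treats each face on its own (a box of one 1 and five 0s, 1000 draws), which is a simplifying assumption rather than a full test of the whole table; the function name is illustrative.

```python
import math

first_die = [17, 15, 27, 16, 2, 23]    # % observed for faces 1..6, first table
second_die = [16, 17, 17, 17, 17, 16]  # % observed for faces 1..6, second table

def deviations_in_se(percents, rolls=1000):
    """Observed minus expected percentage for each face, in SE units."""
    p = 1 / 6                                       # chance of any one face on a fair die
    expected_pct = 100 * p
    se_pct = 100 * math.sqrt(p * (1 - p) / rolls)   # SE of a percentage
    return [(obs - expected_pct) / se_pct for obs in percents]

for name, table in (("first", first_die), ("second", second_die)):
    devs = deviations_in_se(table)
    worst = max(devs, key=abs)
    print(name, "table: largest deviation is", round(worst, 1), "SEs")
```

In the first table, face 5 falls roughly 12 SEs below what a fair die would give; in the second table, no face is even 1 SE away.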

Hair Color and Eye Color

Say I collect hair color and eye color data from 592 statistics students and observe the contingency table below. Are hair color and eye color related?

                        Eye Color
Hair Color    Brown    Blue    Hazel    Green
Black            68      20       15        5
Brown           119      84       54       29
Red              26      17       14       14
Blond             7      94       10       16

Hair Color and Eye Color (cont.)

Say I collect hair color and eye color data from 592 statistics students and observe the contingency table below. Are hair color and eye color related?

                        Eye Color
Hair Color    Brown    Blue    Hazel    Green
Black            37      36       37       39
Brown            34      33       40       37
Red              35      37       38       39
Blond            37      38       39       36
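
A rough way to compare the two hair and eye color tables is to ask what counts we would expect if hair and eye color were unrelated, and how far the observed counts stray from that. The sketch below uses the standard independence calculation (row total times column total, divided by the grand total); the function names are illustrative, and the lecture itself does not carry out this computation.

```python
def expected_counts(table):
    """Counts expected if the row and column variables were unrelated."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def largest_gap(table):
    """Largest |observed - expected| over all cells of the table."""
    expected = expected_counts(table)
    return max(
        abs(obs - exp)
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )

# Rows: hair Black, Brown, Red, Blond; columns: eyes Brown, Blue, Hazel, Green
first_table = [[68, 20, 15, 5], [119, 84, 54, 29], [26, 17, 14, 14], [7, 94, 10, 16]]
second_table = [[37, 36, 37, 39], [34, 33, 40, 37], [35, 37, 38, 39], [37, 38, 39, 36]]

print("first table, largest gap:", round(largest_gap(first_table), 1))
print("second table, largest gap:", round(largest_gap(second_table), 1))
```

The first table has cells that miss their expected counts by dozens of students, while every cell in the second table is within a few students of its expected count.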

Experimental Medication

Say I am interested in testing a new allergy medication. I decide to use the method of comparison to decide if the new medicine is better than the old one. What is the gold standard in designing a study for this?

I observe that 76% of people using the old medicine see improvement in their allergy symptoms, while 78% of people using the new medicine see improvement in their allergy symptoms. Is this enough evidence to use the new medicine?

(Chance) Error?

Each of these scenarios features a difference, usually between what we expect to observe and what we actually observe.

Until now, we’ve said that
    observed value − expected value = chance error

Now, we instead say that
    observed value − expected value = error,
and ask ourselves, “Is the error due to chance, or to something else?”

Hypotheses

Hypothesis: a supposition or proposed explanation made as a starting point for further investigation

Hypothesis Testing: using the data collected to evaluate the hypothesis, and to make an informed decision about whether the hypothesis is true

A hypothesis is a statement. At its most general, it says:
- “The world looks like ____”
- “The world does not look like ____”

Hypotheses (cont.)

We have two hypotheses. Generally, the:
- null hypothesis states a specific view
- alternative hypothesis states that the null is wrong (many possible views, perhaps in a certain way)

We always make a decision about the null hypothesis

Example: Sasquatch lives...or does he?

We are interested in the existence (or lack thereof) of an ape-like, bipedal humanoid known as Sasquatch, or Bigfoot. We decide to test the following hypotheses:
- Null: There is no such thing as a Sasquatch
- Alternative: There is such a thing as a Sasquatch

If we find a Sasquatch, what can we do? If we don’t find a Sasquatch, what can we do?

Example: WMDs exist...or do they?

We are interested in the existence (or lack thereof) of weapons of mass destruction in the hypothetical country known as Schmiraq. We decide to test the following hypotheses:
- Null: There are no WMDs in Schmiraq
- Alternative: There are WMDs in Schmiraq

If we find WMDs in Schmiraq, what can we do? If we don’t find WMDs in Schmiraq, what can we do?

Hypothesis Testing

In these cases, an absence of evidence ≠ evidence of an absence. In other words, we cannot prove the null hypothesis. We can either disprove it, or fail to disprove it.

Hypothesis Testing (cont.)

Another way of looking at null vs. alternative hypotheses:
- Rejecting the Null: There is a real difference between the world view in the null and the data we saw
- Failing to Reject the Null: There is no real difference between the world view in the null and the data we saw

Hypothesis Testing (cont.)

Commonly, hypothesis testing is about reasoning from a sample to an unknown box. We are making inferences (guesses) about the box without ever seeing it; we only see a simple random sample.

Steps in Hypothesis Testing

1. State the hypotheses
2. Gather evidence
3. Compare evidence to null hypothesis
4. Decide whether or not to reject the null hypothesis

Step 1: State the hypotheses

A hypothesis is a statement about the box
- Null: “The box looks like ____”
- Alternative: “The box does not look like ____”

Usually, talking about a population parameter from the box

Step 1: State the hypotheses (cont.)

We know that there will usually be a difference between what we observe and what we expect.
    observed value − expected value = error

Ask ourselves, “Is this error due to chance? Or something else?”
- Null: The difference between the sample and the box is due to chance error
- Alternative: The difference between the sample and the box is not due to chance error, but to the box being different

Step 2: Gather Evidence

This is done via sampling or repeated experimentation. We will usually use an SRS from a box model.

Step 3: Compare evidence to the null hypothesis

With numerical data, we compare a sample statistic to a population parameter, using a test statistic.
- It is calculated using:
  - observed values
  - expected values (from the null hypothesis)
- Statistics near 0 indicate small differences between the null hypothesis and the data
- Statistics far from 0 indicate large differences between the null hypothesis and the data
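
The lecture defers the actual calculation to next time. As a sketch only, one common form of test statistic (assumed here, not taken from the slides) divides the observed-minus-expected difference by the standard error, so the result is measured in SE units:

```latex
\[
  \text{test statistic}
  \;=\;
  \frac{\text{observed value} - \text{expected value}}{\text{SE}}
\]
```

Here the expected value and the SE are both computed as if the null hypothesis were true, which is why values near 0 agree with the null and values far from 0 do not.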

Step 4: Decide whether or not to reject the null hypothesis

- Reject the null hypothesis when evidence significantly contradicts the world view put forth by the null. “There is enough evidence to reject the null hypothesis”
- Do not reject the null hypothesis when evidence does not significantly contradict the world view put forth by the null. “There is not enough evidence to reject the null hypothesis”

Can some evidence be stronger than others?

Example: GPAs

I believe that Cal is populated by smart students, who work hard, and thus have high GPAs. I think the average GPA of all Cal students is 3.0. My office mate thinks that the professors are difficult graders, and thus the GPA is lower than 3.0.

1. State the hypotheses:
   Null: The average GPA of a Cal student is 3.0
   Alt.: The average GPA of a Cal student is less than 3.0
2. Gather evidence: Take an SRS of 1,000 students from Cal, and record their GPAs. Observe an average GPA of 2.7.
3. Compare evidence to null hypothesis: Calculate a test statistic.

Example: GPAs (cont.)

4. Decide whether or not to reject the null hypothesis: Let’s pretend we reject, and conclude that the difference between the observed GPA (2.7) and the hypothesized GPA (3.0) is not due to chance.

Another SRS of 1,000 Cal students is taken, and an average GPA of 2.5 is observed. Based on this SRS, should we reject the null hypothesis? Is this evidence stronger or weaker than the evidence from the first SRS? Why?
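
To see why the second sample might count as stronger evidence, the sketch below expresses each observed average as a distance from 3.0 in SE units. The slides give no SD for GPAs, so the value 0.5 used here is purely hypothetical, as is the function name; only the sample size and the averages come from the example.

```python
import math

def z_for_average(observed_avg, null_avg, sd, n):
    """(observed - expected) / SE for a sample average."""
    se_avg = sd / math.sqrt(n)  # SE of the average of n draws
    return (observed_avg - null_avg) / se_avg

ASSUMED_SD = 0.5  # hypothetical: the slides do not report an SD for GPAs

z_first = z_for_average(2.7, 3.0, ASSUMED_SD, 1000)
z_second = z_for_average(2.5, 3.0, ASSUMED_SD, 1000)
print("first SRS: ", round(z_first, 1), "SEs from 3.0")
print("second SRS:", round(z_second, 1), "SEs from 3.0")
```

With the same SD and sample size for both samples, the average of 2.5 lands farther below 3.0 in SE units than the average of 2.7, whatever SD is plugged in.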

Strength of Evidence

To quantify the difference between these two pieces of evidence, we use observed levels of significance
- All observed levels of significance are between 0% and 100% (or 0 and 1)
- Lower level ⇒ stronger evidence against null hypothesis
- Higher level ⇒ weaker evidence against null hypothesis
- Called the “p-value”
- Represents the chance the test statistic is as extreme as, or more extreme than, the one observed, assuming the null hypothesis is true
- Does not represent the chance the null hypothesis is true
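
The definition above can be illustrated with the earlier coin example: assuming the null hypothesis is a fair coin and the alternative is that the coin favors heads (a one-sided choice made here for illustration), the p-value is the chance of seeing at least as many heads as were observed. The sketch below computes that chance exactly by counting outcomes; the function name is illustrative.

```python
from math import comb

def p_value_at_least(heads, flips):
    """Chance of `heads` or more heads in `flips` tosses of a fair coin."""
    # Number of equally likely head/tail sequences with at least `heads` heads
    favorable = sum(comb(flips, k) for k in range(heads, flips + 1))
    return favorable / 2 ** flips

print("103 or more heads in 200 flips:", p_value_at_least(103, 200))
print("175 or more heads in 200 flips:", p_value_at_least(175, 200))
```

The first chance is large (more than a third), so 103 heads is weak evidence against a fair coin; the second is vanishingly small, so 175 heads is very strong evidence against it. Neither number is the chance that the coin is fair.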

Strength of Evidence (cont.)

Customarily:
- If p-value < 5%, evidence is (statistically) significant. “The p-value = ____ and thus we reject the null hypothesis due to statistically significant evidence.”
- If p-value < 1%, evidence is highly (statistically) significant. “The p-value = ____ and thus we reject the null hypothesis due to highly (statistically) significant evidence.”
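
The customary cutoffs can be written as a small decision rule. This is a minimal sketch; the function name and the wording of the returned labels are illustrative, and the 5% and 1% thresholds are the conventions stated above rather than fixed laws.

```python
def describe_evidence(p_value):
    """Label a p-value using the customary 5% and 1% cutoffs."""
    if not 0 <= p_value <= 1:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value < 0.01:
        return "highly statistically significant: reject the null"
    if p_value < 0.05:
        return "statistically significant: reject the null"
    return "not statistically significant: do not reject the null"

for p in (0.36, 0.03, 0.004):
    print(p, "->", describe_evidence(p))
```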

Important Takeaways

- Hypothesis tests answer questions about the makeup of the world
- All conclusions refer to the null hypothesis
- We cannot prove the null hypothesis. We can only fail to disprove it.
- With box models, hypothesis tests decide whether differences between observed and expected values are due to chance
- Some evidence is stronger than other evidence, as quantified by the p-value
- P-values do not represent the chance that the null hypothesis is true

Next time: Developing hypotheses, calculating test statistics, finding p-values.