STATISTICS BEYOND THE CLASSROOM

Barry Monk, Middle Georgia State College
William Navidi, Colorado School of Mines
Don Brown, Middle Georgia State College

CHALLENGES
A fair amount of research indicates that:
• Many students view a statistics course as an obstacle
• Many students do not (at least initially) see the value in the subject
• Negative attitudes towards statistics correlate with lower performance
Source: James D. Griffith, Lea T. Adams, Lucy L. Gu, Christian L. Hart, and Penney Nichols-Whitehead (2012), Students’ Attitudes Towards Statistics Across the Disciplines: A Mixed Methods Approach

THE SOLUTION
• It’s complicated
• It is important to discuss why statistics should be learned
• Focus students’ attention on conceptual understanding and applications rather than formula manipulation
• Emphasize relevance to students’ fields
• Engage students with real-world problems
• Show how statistics is used Beyond the Classroom

ACCESSIBLE TO THE BEGINNER
• Statistics, perhaps more than any other branch of mathematics, provides a wealth of real-life applications accessible to beginning students

IN LINE WITH GAISE
1. Emphasize statistical literacy and develop statistical thinking
2. Use real data
3. Stress conceptual understanding, rather than mere knowledge of procedures
4. Foster active learning in the classroom
5. Use technology for developing conceptual understanding and analyzing data
6. Use assessments to improve and evaluate student learning

EXAMPLE 1 The Importance of Checking Assumptions
• Aluminum cans must withstand pressures up to 90 pounds per square inch.
• A large shipment of cans is tested by sampling a few cans and applying force until they fail.
• The proportion of cans in the shipment that will fail at a pressure of 90 pounds per square inch must be estimated.

Can             1    2    3    4    5    6    7    8    9    10
Pressure (psi)  95   96   98   99   99   100  101  101  103  104

EXAMPLE 1 The Importance of Checking Assumptions
• None of the sampled cans is defective, but this alone is not enough to conclude that the proportion of defective cans in the shipment is less than 1 in 1000.
• We compute: Sample Mean = 99.6, Sample Standard Deviation = 2.84
• We assume the pressures are normally distributed with mean 99.6 and standard deviation 2.84.

EXAMPLE 1 The Importance of Checking Assumptions
• Under this normal model, the proportion of cans that fail below 90 pounds per square inch, and so the estimated proportion of defective cans in the shipment, is 0.0004.
• Since 0.0004 < 0.001, we accept the shipment.
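The estimate above can be reproduced with a short sketch. It assumes, as the slide does, that the pressures are normal with the sample mean and standard deviation as parameters; the variable names are illustrative, and the result may differ slightly from a value read off a rounded z-table.

```python
from statistics import NormalDist, mean, stdev

# Failure pressures (psi) of the ten sampled cans, from the table in this example.
pressures = [95, 96, 98, 99, 99, 100, 101, 101, 103, 104]

x_bar = mean(pressures)   # sample mean
s = stdev(pressures)      # sample standard deviation (n - 1 denominator)

# Assumption: pressures ~ Normal(x_bar, s). A can is counted as defective
# if it fails below the required 90 psi.
p_defective = NormalDist(mu=x_bar, sigma=s).cdf(90)

print(f"mean = {x_bar:.1f}, sd = {s:.2f}, P(pressure < 90) = {p_defective:.4f}")
```

The whole calculation rests on the normality assumption, which is exactly what the rest of this example puts to the test.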

EXAMPLE 1 The Importance of Checking Assumptions
Another sample of cans was taken.

Can       1    2    3    4    5    6    7    8    9    10
Sample 1  95   96   98   99   99   100  101  101  103  104
Sample 2  96   97   99   100  100  100  101  103  103  120

• Note that the cans in the new sample are stronger than the cans in the original sample.
• For the new sample, we compute: Sample Mean = 101.9, Sample Standard Deviation = 6.74
• We assume the pressures are normally distributed with mean 101.9 and standard deviation 6.74.

EXAMPLE 1 The Importance of Checking Assumptions
• We estimate the proportion of defective cans in the shipment to be 0.0384.
• This is larger than 0.001, so we reject the shipment.
• Why is the shipment of stronger cans rejected?

EXAMPLE 1 The Importance of Checking Assumptions

• The rejected sample contains an outlier. Therefore, the assumption of normality does not hold.
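A minimal sketch of how the outlier drives the rejection. The 1.5 × IQR fence used to flag the outlier is an assumption of this sketch, not a method named in the slides, and the quartiles follow the default method of Python's `statistics.quantiles`. The standard deviation without the flagged value is shown only to illustrate its influence, not as a recommendation to delete data.

```python
from statistics import NormalDist, mean, stdev, quantiles

# Second sample of failure pressures (psi), from the table in this example.
sample_2 = [96, 97, 99, 100, 100, 100, 101, 103, 103, 120]

# Normal-model estimate of P(pressure < 90), as computed on the slide.
p_defective_2 = NormalDist(mean(sample_2), stdev(sample_2)).cdf(90)

# Assumed outlier rule: flag values beyond 1.5 * IQR from the quartiles.
q1, _, q3 = quantiles(sample_2, n=4)
iqr = q3 - q1
lower_fence, upper_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [x for x in sample_2 if x < lower_fence or x > upper_fence]

# Illustration only: how much the flagged value inflates the standard deviation.
sd_without = stdev([x for x in sample_2 if x not in outliers])

print(f"P(pressure < 90) under normality = {p_defective_2:.4f}")
print(f"flagged outliers = {outliers}")
print(f"sd with outlier = {stdev(sample_2):.2f}, without = {sd_without:.2f}")
```

The single very strong can inflates the standard deviation, which fattens both tails of the fitted normal curve, including the lower tail below 90.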

EXAMPLE 1 The Importance of Checking Assumptions
Checking assumptions is important.
• A common example is that polls state a margin of error based on the assumption of simple random sampling.
• In fact, most polls use a version of stratified sampling, matching the sample to the population with respect to age, race, gender, political affiliation, etc.
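For reference, the margin of error that rests on the simple-random-sampling assumption can be sketched as follows. The function name and the poll of 1,000 respondents are hypothetical, chosen only for illustration; they do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist

def srs_margin_of_error(p_hat: float, n: int, confidence: float = 0.95) -> float:
    """Margin of error for a sample proportion, assuming simple random sampling."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return z * sqrt(p_hat * (1 - p_hat) / n)

# Hypothetical poll: 1,000 respondents, 50% support.
moe = srs_margin_of_error(p_hat=0.5, n=1000)
print(f"margin of error = {moe:.3f}")
```

When the sample is actually stratified or weighted, this formula is only an approximation, which is the point of the slide.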

EXAMPLE 2 Skewed Populations to Illustrate Concepts
Skewed populations can often be used to illustrate concepts.
• In a Slate.com investigation, 10,000 videos uploaded to YouTube were tracked for one month to determine the number of hits each video received.

http://www.slate.com/articles/technology/webhead/2009/07/will_my_video_get_1_million_views_on_youtube.html

EXAMPLE 3 Driving it Home (for the Holidays)
A study conducted by the University of Oklahoma examined weight changes in college students over the Thanksgiving holiday break.
• n = 94 students in the sample
• Pre-Thanksgiving: Mean Weight = 72.1 kg, Standard Deviation = 14.0 kg
• Post-Thanksgiving: Mean Weight = 72.6 kg, Standard Deviation = 14.3 kg

EXAMPLE 3 Driving it Home (for the Holidays)
Some media stories suggest that the average person gains 7 to 10 lbs (3.2 to 4.5 kg). In surveys, people say they gain, on average, about 5 lbs (2.27 kg).
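The comparison between the study and the popular claims comes down to simple arithmetic, sketched here. It assumes the same 94 students were weighed before and after the break, so that the difference in means equals the mean change; the variable names are illustrative.

```python
KG_PER_LB = 0.45359237  # exact definition of the pound in kilograms

pre_mean_kg = 72.1    # mean weight before Thanksgiving
post_mean_kg = 72.6   # mean weight after Thanksgiving

# Assuming the same students were measured both times, the difference
# in means is the mean weight change.
change_kg = post_mean_kg - pre_mean_kg
change_lb = change_kg / KG_PER_LB

survey_claim_lb = 5          # what people say they gain, on average
media_claim_lb = (7, 10)     # range suggested by some media stories

print(f"observed mean change: {change_kg:.1f} kg = {change_lb:.1f} lb")
print(f"survey claim is {survey_claim_lb / change_lb:.1f} times the observed change")
```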

EXAMPLE 4 Simpson’s Paradox
• An apparent relationship between two variables can disappear, or even reverse, when a third variable is considered.
• A famous example involves the graduate admissions process at UC Berkeley, which was suspected of discriminating against female applicants.

        Applied  Accepted  % Acc.
Male    2691     1198      44.5
Female  1835     557       30.4

EXAMPLE 4 Simpson’s Paradox
Admissions data were collected for each department, to discover which departments were responsible for the difference in admission rates.

Department  Sex     Applied  Accepted  % Acc.
A           Male    825      512       62.1
A           Female  108      89        82.4*
B           Male    560      353       63.0
B           Female  25       17        68.0
C           Male    325      120       36.9
C           Female  593      202       34.1
D           Male    417      138       33.1
D           Female  375      131       34.9
E           Male    191      53        27.7
E           Female  393      94        23.9
F           Male    373      22        5.9
F           Female  341      24        7.0

* P < 0.05

EXAMPLE 4 Simpson’s Paradox
• Two hospitals, A and B, are being evaluated with regard to a particular procedure.

Hospital  Total  Successful  Percent
A         100    57          57
B         100    43          43

• It appears that hospital A is more successful.

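EXAMPLE 4 Simpson’s Paradox
Some patients arrive in good condition, some in poor condition. Here are the results broken down by condition:

Condition  Hospital  Total  Successful  Percent
GOOD       A         65     51          78.5
GOOD       B         35     29          82.9
POOR       A         35     6           17.1
POOR       B         65     14          21.5
TOTAL      A         100    57          57
TOTAL      B         100    43          43

• Hospital B has the higher success rate for patients in each condition, yet hospital A has the higher rate overall, because A treats more patients who arrive in good condition.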
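The reversal in the hospital data can be checked directly. This is a small sketch using the counts given in the tables of this example; the dictionary layout and helper name are illustrative choices.

```python
# (successful, total) procedures by hospital and patient condition,
# taken from the hospital tables in this example.
results = {
    "A": {"good": (51, 65), "poor": (6, 35)},
    "B": {"good": (29, 35), "poor": (14, 65)},
}

def success_rate(successful: int, total: int) -> float:
    return successful / total

# Success rate within each condition.
by_condition = {
    hospital: {cond: success_rate(*counts) for cond, counts in conds.items()}
    for hospital, conds in results.items()
}

# Overall rate: pool the counts across conditions before dividing.
overall = {
    hospital: success_rate(
        sum(s for s, _ in conds.values()),
        sum(t for _, t in conds.values()),
    )
    for hospital, conds in results.items()
}

for cond in ("good", "poor"):
    print(f"{cond}: A = {by_condition['A'][cond]:.1%}, B = {by_condition['B'][cond]:.1%}")
print(f"overall: A = {overall['A']:.1%}, B = {overall['B']:.1%}")
```

Pooling weights each hospital's rates by its own mix of patients, which is why the overall comparison can point the opposite way from both within-condition comparisons.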

EXAMPLE 5 Sampling Bias
Commonly used sampling methods:
• Simple Random Sampling
• Cluster Sampling
• Stratified Sampling
• Systematic Sampling
• Samples of Convenience
• Voluntary Response Sampling
All of these are valid except for voluntary response sampling.

EXAMPLE 5 Sampling Bias
Every survey involves some degree of voluntary response, because people can choose not to respond. The resulting bias is often referred to as non-response bias.

EXAMPLE 5 Sampling Bias
Libby Woodstove Study
• Carried out in Libby, Montana
• Most homes are heated with wood stoves, which emit particulate pollution
• The study measured levels of particulate pollution
• Schoolchildren were given questionnaires about their health to take home
• Parents filled out questionnaires and children returned them in school
• Parents who did not return questionnaires were mailed a second copy

EXAMPLE 5 Sampling Bias

Responded First Time                       Responded Later
Date    PM      Responses  Wheezed         Date    PM      Responses  Wheezed
Mar 5   19.815  3          0               Apr 12  14.422  10         1
Mar 6   19.885  72         9               Apr 13  14.418  9          1
Mar 7   20.006  69         5               Apr 14  14.405  8          0
Mar 8   19.758  30         1               Apr 15  14.141  3          0
Mar 9   19.827  44         7               Apr 16  13.910  4          0
Mar 10  19.686  31         1               Apr 17  13.951  2          0
Mar 11  19.823  38         3               Apr 18  13.545  2          0
Mar 12  19.697  66         5               Apr 20  13.326  3          0
Mar 13  19.505  42         4               Apr 22  13.154  2          0
Mar 14  19.359  31         1
Mar 15  19.348  19         4
Mar 16  19.318  3          1
Mar 17  19.124  2          0

EXAMPLE 5 Sampling Bias
PM levels are higher for those who responded the first time.

                      Wheezed  Responses  Percentage
High PM               41       450        9.1%
Low PM                2        43         4.7%

                      Wheezed  Responses  Percentage
Responded First Time  41       450        9.1%
Responded Later       2        43         4.7%
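The summary figures can be rebuilt from the daily data, which also makes it plain that "high PM" and "responded first time" describe the same group of responses. A minimal sketch, with the daily rows copied from the table in this example and an illustrative helper name:

```python
# Daily rows as (PM level, responses, number who wheezed).
first_time = [
    (19.815, 3, 0), (19.885, 72, 9), (20.006, 69, 5), (19.758, 30, 1),
    (19.827, 44, 7), (19.686, 31, 1), (19.823, 38, 3), (19.697, 66, 5),
    (19.505, 42, 4), (19.359, 31, 1), (19.348, 19, 4), (19.318, 3, 1),
    (19.124, 2, 0),
]
later = [
    (14.422, 10, 1), (14.418, 9, 1), (14.405, 8, 0), (14.141, 3, 0),
    (13.910, 4, 0), (13.951, 2, 0), (13.545, 2, 0), (13.326, 3, 0),
    (13.154, 2, 0),
]

def summarize(rows):
    """Return (total responses, total wheezed, percent wheezed)."""
    responses = sum(r for _, r, _ in rows)
    wheezed = sum(w for _, _, w in rows)
    return responses, wheezed, 100 * wheezed / responses

for label, rows in (("Responded first time", first_time), ("Responded later", later)):
    responses, wheezed, pct = summarize(rows)
    pm_low, pm_high = min(pm for pm, _, _ in rows), max(pm for pm, _, _ in rows)
    print(f"{label}: PM {pm_low}-{pm_high}, {wheezed}/{responses} wheezed ({pct:.1f}%)")
```

Because every first-time response falls in the high-PM period and every later response in the low-PM period, the data alone cannot separate the effect of exposure from the effect of responding promptly.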

EXAMPLE 5 Sampling Bias
Conclusions
• High-exposure children were more likely to wheeze
• Prompt responders were more likely to wheeze
• People with symptoms are generally more eager to participate in studies than those without, so they may be more likely to respond promptly
• Non-response bias (or late-response bias) potentially affects many survey results

EXAMPLE 6 King Tut’s Curse
King Tutankhamun’s tomb was opened on November 29, 1922. Shortly thereafter, best-selling novelist Marie Corelli wrote an imaginative letter, published in newspapers around the world, claiming that an ancient text warned that “the most dire punishment follows any rash intruder into a sealed tomb.”

On April 5, 1923, Lord Carnarvon, the financial backer of the excavation team, who was present when the tomb was opened, died in Cairo of a mosquito bite that became infected.

EXAMPLE 6 King Tut’s Curse
George Jay Gould, a visitor to the tomb, died on May 16, 1923, after developing a fever following his visit.

Prince Ali Kamel Fahmy Bey of Egypt died in July of 1923.

Lord Carnarvon’s half-brother died in September of 1923. The following years brought even more deaths, including a radiologist who x-rayed Tutankhamun’s mummy, a member of the excavation team, another of Carnarvon’s half-brothers, and many others, often under mysterious circumstances.

EXAMPLE 6 King Tut’s Curse
In 2002, a study by Dr. Mark Nelson investigated the mummy’s curse. The study involved Westerners who were “exposed to the curse” (n1 = 25) and other Westerners in Egypt at the same time (n2 = 11).
Exposure To The Curse:
• Present at the breaking of the seal of the tomb
• Present at the opening of the sarcophagus
• Present at the opening of the coffins
• Those who examined the mummy

EXAMPLE 6 King Tut’s Curse

                  Exposed      Not Exposed  P-Value
Age at Death      70.0 (12.4)  75.0 (13.0)  0.87
Survival (years)  20.8 (15.2)  28.9 (13.6)  0.95

• Intuitive meaning of P-value
• What kind of study was this?
• What assumptions needed to be made in order to do this study?
• Are there possible confounders?
