## Chapter 1 Introduction to Statistics

Chapter Problem: Why was the Literary Digest poll so wrong?

• The Literary Digest had successfully predicted the presidential elections of 1916, 1920, 1924, 1928, and 1932.
• For the 1936 presidential election (between Alf Landon and Franklin Roosevelt), the Literary Digest sent out 10 million ballots, and the results showed 57% for Landon.
• Roosevelt actually received 61% of the popular vote, and Landon received 37%.
• The Literary Digest suffered a humiliating defeat and soon went out of business.
• Gallup used a much smaller sample of 50,000 subjects and predicted the outcome correctly.

[Figure: percentage for Roosevelt (axis from 0% to 70%) in the Literary Digest poll and in the Gallup poll, compared with the 61% of the popular vote that Roosevelt actually received.]

### 1.1 Review and Preview

Definitions
• Data are observations (such as measurements, genders, …).
• Statistics is the science of planning studies, collecting data, …, and drawing conclusions.
• A population is the complete collection of all elements.
• A census is the collection of data from every member of the population.
• A sample is a subcollection of members selected from a population.

Key concepts
• Sample data must be collected in an appropriate way.
• Otherwise, the data may be completely useless.

### 1.2 Statistical Thinking

Basic principles of statistical thinking used throughout this book:
• Context of the data
• Source of the data
• Sampling method
• Conclusions
• Practical implications

1. Context of the data

Consider the data in Table 1-1 (from Data Set 3 in Appendix B).

Table 1-1 Data Used for Analysis

| x | 56 | 67 | 57 | 60 | 64 |
|---|----|----|----|----|----|
| y | 53 | 66 | 58 | 61 | 68 |

E.g. 1 Context for Table 1-1
• The values are weights (in kg) of Rutgers students.
• The x values were measured in September of the freshman year.
• The y values were measured in April of the following spring semester.
• The data are real. They come from “Changes in Body Weight and Fat Mass of Men and Women in the First Year of College: A Study of the ‘Freshman 15,’” Journal of American College Health.
• Goal: determine whether college students actually gain 15 lb during their freshman year.
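Once the context is known, the natural quantity to look at is each student's change in weight (April minus September). The following is a minimal sketch using only the five pairs shown in Table 1-1; the kg-to-lb conversion factor and all variable names are assumptions added for illustration, and five pairs are far too few to settle the "Freshman 15" question.

```python
# Weights (kg) from Table 1-1: x = September, y = the following April.
x = [56, 67, 57, 60, 64]
y = [53, 66, 58, 61, 68]

KG_TO_LB = 2.20462  # standard conversion factor (assumption: not given in the slides)

# Paired differences: a positive value means the student gained weight.
changes_kg = [after - before for before, after in zip(x, y)]
mean_change_kg = sum(changes_kg) / len(changes_kg)
mean_change_lb = mean_change_kg * KG_TO_LB

print(changes_kg)                # [-3, -1, 1, 1, 4]
print(round(mean_change_kg, 2))  # 0.4
print(round(mean_change_lb, 2))  # 0.88
```

For these five students the mean change is well under 1 lb, nowhere near 15 lb.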

2. Source of the data

E.g. 2 Source of the data for Table 1-1 (the source needs to be objective, not biased)
• Reputable researchers from the Department of Nutritional Sciences at Rutgers University compiled the measurements in Table 1-1.
  – They have no incentive to distort or spin the results.
  – They have nothing to gain.
  – They were not paid to produce favorable results.
• We can be confident that these researchers are unbiased and did not distort the results.

3. Sampling method

E.g. 3 Sampling used for Table 1-1 (ideally the sample is random)
• The weights in Table 1-1 are from the larger sample in Data Set 3 in Appendix B.
• 217 students participated in September.
• All of them were invited for a follow-up in the spring; 67 responded and were measured in the last two weeks of April.
• This is a voluntary response sample.
• The researchers wrote that “the sample obtained was not random and may have introduced self-selection bias,” and noted that the sample included “only those students who felt comfortable enough with their weights to be measured both times.”

4. Conclusions

E.g. 4 Conclusions from the data in Table 1-1
• The results were published.
• Weight gain occurred during the freshman year.
• For the small nonrandom group studied, the weight gain was less than 15 lb, and this amount was not universal.
• The researchers concluded that the “Freshman 15” weight gain is a myth.

5. Practical implications

E.g. 5 Practical implications from the data in Table 1-1
In addition to the conclusions of the statistical analysis, we should also identify any practical implications of the results.
• The researchers wrote that “it is perhaps most important for students to recognize that seemingly minor and perhaps even harmless changes in eating or exercise behavior may result in large changes in weight and body fat mass over an extended period of time.”

6. Statistical significance versus practical significance

E.g. 6 Atkins weight loss program: in a study of 40 subjects, the mean weight loss was 2.1 lb. Even if a loss of this size is statistically significant, it may be too small to have practical significance.

Statistical significance is discussed further throughout the book.

Statistical significance

E.g. 7 Statistical significance
A company developed MicroSort, which supposedly increases the chances of a couple having a baby girl. In a preliminary test, researchers located 14 couples who wanted baby girls. After using MicroSort, 13 of them had girls and one had a boy. There are two possible conclusions:
• MicroSort is not effective, and the 13 girls in 14 births occurred by chance.
• MicroSort is effective, as claimed by the company.

A statistician reasons as follows:
• If MicroSort has no effect, there is about 1 chance in 1000 of getting results like those above.
• Because that likelihood is so small, the statistician concludes that the result is statistically significant, so it appears that MicroSort is effective.
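The "1 chance in 1000" figure can be reproduced with a binomial calculation. This is a sketch under assumptions the slide does not state: boys and girls are equally likely when MicroSort has no effect, births are independent, and "results above" means 13 or more girls in 14 births. The function name `prob_at_least` is made up for this illustration.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float = 0.5) -> float:
    """P(X >= k) for X ~ Binomial(n, p): at least k girls in n births."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


# 13 or more girls in 14 births when each birth is a girl with probability 0.5.
p_13_of_14 = prob_at_least(13, 14)
print(round(p_13_of_14, 6))  # 0.000916, i.e. roughly 1 chance in 1000
```

The exact value is 15/16384, so the result is very unlikely if the method has no effect.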

E.g. 8 Statistical significance
Consider the same MicroSort test with a different outcome. Researchers located 14 couples who wanted baby girls. After using MicroSort, 8 of them had girls and 6 had boys. There are two possible conclusions:
• MicroSort is not effective, and the 8 girls in 14 births occurred by chance.
• MicroSort is effective, as claimed by the company.

A statistician reasons as follows:
• If MicroSort has no effect, there are about two chances in 5 of getting results like those above.
• Unlike one chance in 1000, two chances in 5 indicates that the results could easily occur by chance. The result of 8 girls in 14 births is therefore not statistically significant.
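The "two chances in 5" figure can be checked by hand. The calculation assumes (these assumptions are not stated on the slide) that boys and girls are equally likely, that births are independent, and that "results above" means 8 or more girls in 14 births:

```latex
P(X \ge 8) = \sum_{k=8}^{14} \binom{14}{k} \left(\tfrac{1}{2}\right)^{14}
           = \frac{3003 + 2002 + 1001 + 364 + 91 + 14 + 1}{16384}
           = \frac{6476}{16384} \approx 0.395
```

A probability of about 0.395 is close to 2/5, so an outcome like this happens by chance quite often.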

### 1.3 Types of Data

Definition: parameter and statistic
• A parameter is a numerical measurement describing some characteristic of a population.
• A statistic is a numerical measurement describing some characteristic of a sample.
• Example 1
  – Parameter: There are 100 senators in the 109th Congress, and 55% of them are Republicans. The 55% is a parameter because it is based on the entire population of 100 senators.
  – Statistic: In 1936, the Literary Digest polled 2.3 million adults in the U.S., and 57% of them said that they would vote for Alf Landon. The 57% is a statistic because it is based on a sample.

Definition: quantitative data and categorical data
• Quantitative data consist of numbers representing counts or measurements.
• Categorical (or qualitative or attribute) data consist of names or labels that are not numbers representing counts or measurements.
• Example 2
  – Quantitative data: the ages (in years) of survey respondents.
  – Categorical data: the political party affiliations (Democratic, Republican, independent, other) of survey respondents.
  – Categorical data: the numbers 24, 28, 17, 54, and 31 sewn on the shirts of the LA Lakers starting basketball team. These numbers are substitutes for names.

Definition: discrete data and continuous data
Quantitative data can be further divided into discrete data and continuous data.
• Discrete data result when the number of possible values is either a finite number or a “countable” number.
  – E.g. 3 The number of eggs
• Continuous data result from infinitely many possible values that correspond to some continuous scale.
  – E.g. 3 The amount of milk from cows

Four levels of measurement
• Nominal: data that consist of names, labels, or categories only.
  – E.g. 4 Yes/no/undecided responses from a survey
  – E.g. 4 Political party affiliation
• Ordinal: the data can be arranged in order, but differences between data values are not meaningful.
  – E.g. 5 Course grades (A, B, C, D, or F)
  – E.g. 5 Ranks (e.g., first, second, third, …)
• Interval: differences between data values are meaningful, but there is no natural zero starting point.
  – E.g. 6 Temperatures
  – E.g. 6 Years
• Ratio: differences are meaningful and there is a natural zero starting point.
  – E.g. 7 Distances
  – E.g. 7 Prices

• Most of the difficulty arises with the distinction between the interval and ratio levels.
  – Hint: use the “ratio test”: consider two quantities where one number is twice the other, and ask whether “twice” can be used to correctly describe the quantities.
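One way to see why the ratio test works: for ratio-level data, "twice" survives a change of units, while for interval-level data it does not. The sketch below is an illustration only; the specific values (10 and 20) and the choice of Celsius, Fahrenheit, kilometers, and miles are assumptions, not part of the slides.

```python
def c_to_f(celsius: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


KM_PER_MILE = 1.609344

# Interval level: is 20 degrees C "twice as hot" as 10 degrees C?
ratio_celsius = 20 / 10                      # 2.0
ratio_fahrenheit = c_to_f(20) / c_to_f(10)   # 68 / 50

# Ratio level: is 20 km "twice as far" as 10 km?
ratio_km = 20 / 10                                        # 2.0
ratio_miles = (20 / KM_PER_MILE) / (10 / KM_PER_MILE)     # still 2.0

print(ratio_celsius, ratio_fahrenheit)   # 2.0 1.36
print(ratio_km, round(ratio_miles, 6))   # 2.0 2.0
```

The temperature ratio depends on the scale because 0 degrees is an arbitrary point, so temperatures fail the ratio test. Distances have a natural zero, so the ratio is the same in any unit.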


### 1.4 Critical Thinking

Key concept: use common sense when interpreting data and statistics.

• Two main sources of deception
  – Intentional (evil intent)
  – Unintentional (error)

Examples of bad samples

• Definition: a voluntary response sample (or self-selected sample) is one in which the respondents themselves decide whether to be included.
• E.g. 1 Newsweek.msnbc.com Napster poll
  – People with strong opinions are more likely to participate.
  – Other examples: Internet polls, mail-in polls, and call-in polls.
• E.g. 2 Literary Digest: ballots were sent to magazine subscribers, registered car owners, and telephone users.

Publication bias
• There is a tendency to publish positive results.

Reported results
• When collecting data from people, it is better to take measurements yourself than to rely on what subjects report.
• E.g. 3 Voting behavior: when 1002 eligible voters were surveyed,
  – 70% said that they had voted in a recent presidential election (ICR Research);
  – records show that only 61% voted.

Small samples
• E.g. 4 The Children’s Defense Fund published Children Out of School in America.
  – Among secondary school students, 67% were suspended at least 3 times.
  – Sample size = 3 students.

Percentages
• Percentages can be misleading.
  – E.g. 5 In referring to lost baggage, Continental Airlines ran ads claiming that this was “an area where we’ve already improved 100% in the last six months.”
  – A 100% improvement would mean that no baggage is lost at all.

Loaded questions
• Questions can be intentionally worded to elicit a desired response.
• Example 6
  – 97% yes: “Should the President have the line item veto to eliminate waste?”
  – 57% yes: “Should the President have the line item veto, or not?”

Order of questions
• Example 7
  – “Would you say that traffic contributes more or less to air pollution than industry?” 45% blamed traffic; 27% blamed industry.
  – “Would you say that industry contributes more or less to air pollution than traffic?” 24% blamed traffic; 57% blamed industry.

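Nonresponse
• Nonresponse occurs when someone refuses to answer or is unavailable.
• The refusal rate has been growing in recent years.
• People who refuse may have a view of the world around them that differs from that of people who respond.

Missing data
• Sometimes data are missing because of random causes.
• At other times data are missing because of special factors.
• The census also suffers from missing data.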

Correlation and causality
• A correlation is a statistical association between two variables.
• E.g. wealth and IQ
• Correlation does not imply causality.

Self-interest studies
• Studies are sometimes sponsored by parties with an interest in promoting products, etc.
• Kiwi Brands, a maker of shoe polish, commissioned a study about scuffed shoes.
• Pharmaceutical companies pay doctors …

Precise numbers
• “There are now 103,215,027 households in the U.S.”
• It is better to say that the number of households is about 103 million.

Deliberate distortion
• Hertz sued Avis for false advertising based on a survey.
• Avis had claimed to be the winner in a survey of people who rent cars.

### 1.5 Collecting Sample Data

Key concept: simple random samples
• If sample data are not collected in an appropriate way, the data may be so completely useless that no amount of statistical torturing can salvage them.
• Throughout this book
  – We will use a variety of different statistical procedures.
  – These procedures often require that the sample be a simple random sample.

Part 1: Basics of Collecting Data

Definition: observational studies and experiments
• In an observational study, we observe and measure specific characteristics, but we don’t attempt to modify the subjects being studied (e.g. the Gallup Poll).
• In an experiment, we apply some treatment and then proceed to observe its effects on the subjects. (Subjects in experiments are called experimental units.)
• Example 1
  – Observational study: a poll in which subjects are surveyed, but they are not given any treatment.
  – Experiment: in the largest public health experiment, 200,745 children were given a treatment consisting of the Salk vaccine, while 201,229 other children were given a placebo.

Common sampling methods:
• A simple random sample of n subjects is selected in such a way that every possible sample of the same size n has the same chance of being selected.
• In a random sample, members from the population are selected in such a way that each individual member has an equal chance of being selected.
• A probability sample involves selecting members from a population in such a way that each member has a known (but not necessarily the same) chance of being selected.
• Example 2 Sampling senators: create 50 index cards, each with the name of one state. Mix the 50 cards in a bowl and select one card. If we consider the two senators from the selected state to be a sample, is this result a random sample? A simple random sample? A probability sample?
  – Answer: it is a random sample and a probability sample, but NOT a simple random sample (each senator has the same chance of being selected, but not every pair of senators can be selected).
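The senator example can be checked by counting. The sketch below assumes 50 states with exactly 2 senators each, as in the example; the variable names are made up for illustration.

```python
from math import comb

n_states = 50
senators_per_state = 2
n_senators = n_states * senators_per_state  # 100

# Drawing one state card selects both of that state's senators, so every
# senator has the same known chance of selection: random and probability sample.
p_each_senator = 1 / n_states

# Only 50 same-state pairs can ever be drawn with this scheme ...
possible_samples_scheme = n_states
# ... whereas a simple random sample of size 2 must allow every possible pair.
possible_samples_srs = comb(n_senators, 2)

print(p_each_senator)            # 0.02
print(possible_samples_scheme)   # 50
print(possible_samples_srs)      # 4950
```

Because 4900 of the 4950 possible pairs have no chance of being selected, the scheme is not a simple random sample.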

Common sampling methods:
• Systematic sampling: select a starting point and then select every kth element (e.g. every 50th).
• Convenience sampling: use results that are very easy to get.
• Stratified sampling: subdivide the population into at least two different subgroups (or strata) so that subjects within the same subgroup share the same characteristics (e.g. gender), then draw a sample from each subgroup (or stratum).
• Cluster sampling: first divide the population into sections (or clusters), then randomly select some of those clusters, and choose all the members from those selected clusters.

• Neither stratified sampling nor cluster sampling satisfies the simple random sample requirement.
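The differences among systematic, stratified, and cluster sampling are easier to see side by side. The following is a minimal sketch using Python's standard `random` module on a made-up population; the population, the `sex` and `dorm` fields, and the sample sizes are all hypothetical and chosen only for illustration.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100 subjects, each with an id, a sex, and a dorm (0-9).
population = [
    {"id": i, "sex": "F" if i % 2 == 0 else "M", "dorm": i // 10}
    for i in range(100)
]

# Systematic sampling: random starting point, then every k-th element.
k = 10
start = rng.randrange(k)
systematic = population[start::k]

# Stratified sampling: split into strata that share a characteristic,
# then draw a random sample from each stratum.
strata = {"F": [], "M": []}
for person in population:
    strata[person["sex"]].append(person)
stratified = [p for group in strata.values() for p in rng.sample(group, 5)]

# Cluster sampling: randomly select whole clusters, then keep every member.
chosen_dorms = set(rng.sample(range(10), 2))
cluster = [p for p in population if p["dorm"] in chosen_dorms]

print(len(systematic), len(stratified), len(cluster))  # 10 10 20
```

Note that the stratified sample always contains exactly 5 women and 5 men, and the cluster sample always consists of two complete dorms, so many samples of the same size can never occur. That is why neither method yields a simple random sample.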


Example 3 Multistage sample design
The U.S. government’s unemployment statistics are based on surveyed households. The U.S. Census Bureau and the Bureau of Labor Statistics conduct a survey called the Current Population Survey. This survey obtains data describing such factors as unemployment rates, college enrollments, and weekly earnings. The survey incorporates a multistage sample design, roughly following these steps:

Example 3 (continued)
1. The entire U.S. is partitioned into 2007 different regions called primary sampling units (PSUs). The primary sampling units are metropolitan areas, large counties, or groups of smaller counties.
2. For the Current Population Survey, 792 of the PSUs are used. (All of the 432 PSUs with the largest populations are used, and the other 360 PSUs are randomly selected from the other 1575.)
3. Each of the 792 selected PSUs is partitioned into blocks, and stratified sampling is used to select a sample of blocks.
4. In each selected block, clusters of households that are close to each other are identified. Clusters are randomly selected, and all households in the selected clusters are interviewed.
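The counts in step 2 can be checked with a little arithmetic. The sketch below also computes a selection chance for the smaller PSUs under the assumption, made here for illustration only, that each of the remaining 1575 PSUs is equally likely to be chosen; the slides say only that they are "randomly selected".

```python
total_psus = 2007
certainty_psus = 432   # largest PSUs, always included
sampled_psus = 360     # randomly selected from the remaining PSUs

remaining_psus = total_psus - certainty_psus   # PSUs not included automatically
psus_used = certainty_psus + sampled_psus      # PSUs actually surveyed

# Assumption for illustration: equal chance for each remaining PSU.
p_select_remaining = sampled_psus / remaining_psus

print(remaining_psus, psus_used)       # 1575 792
print(round(p_select_remaining, 3))    # 0.229
```

Under that assumption the largest PSUs are selected with probability 1 and the others with probability of about 0.229, which illustrates a probability sample: every unit has a known, but not the same, chance of selection.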


Part 2: Beyond the Basics of Collecting Data
• Observational study
• Experiment

Figure 1-3 Types of Observational Studies

Observational study: observe and measure, but do not modify.
• Past period of time: retrospective (or case-control) study. Go back in time to collect data over some past period.
• One point in time: cross-sectional study. Data are measured at one point in time.
• Forward in time: prospective (or longitudinal or cohort) study. Go forward in time and observe groups sharing common factors, such as smokers and nonsmokers.

Definition: types of observational studies
• In a cross-sectional study, data are observed, measured, and collected at one point in time.
• In a retrospective (or case-control) study, data are collected from the past by going back in time.
• In a prospective (or longitudinal or cohort) study, data are collected in the future from groups sharing common factors (called cohorts).

Design of experiments: start with an example.

E.g. 4 The Salk Vaccine Experiment
In 1954, a large-scale experiment was designed to test the effectiveness of the Salk vaccine in preventing polio.
• 200,745 children were given a treatment consisting of a Salk vaccine injection, while a second group of 201,229 children were injected with a placebo that contained no drug.
• The children being injected did not know whether they were getting the Salk vaccine or the placebo.
• Children were assigned to the treatment or placebo group through a process of random selection, equivalent to flipping a coin.
• Among the children given the Salk vaccine, 33 later developed paralytic polio, but among the children given the placebo, 115 later developed paralytic polio.
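Because the two groups differ slightly in size, the counts are easier to compare as rates. The sketch below uses only the numbers given above; expressing the rates per 100,000 children is a presentation choice made here, not something stated on the slide.

```python
vaccine_n, vaccine_cases = 200_745, 33
placebo_n, placebo_cases = 201_229, 115

# Paralytic polio cases per 100,000 children in each group.
vaccine_rate = vaccine_cases / vaccine_n * 100_000
placebo_rate = placebo_cases / placebo_n * 100_000

print(round(vaccine_rate, 1))                  # 16.4
print(round(placebo_rate, 1))                  # 57.1
print(round(placebo_rate / vaccine_rate, 1))   # 3.5
```

The placebo group's rate is roughly three and a half times the vaccine group's rate. Whether a gap like this could plausibly be due to chance is a question of statistical significance.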

Experiment
• Randomization: subjects are assigned to different groups through a process of random selection.
• Replication: the repetition of an experiment on more than one subject.
  – Repeat the experiment so that results can be verified.
  – Use a large sample size and a good sampling method.
  – E.g. in the Salk vaccine test, there were about 200,000 children in each group.
• Blinding: a technique in which the subject does not know whether he or she is receiving the treatment or a placebo.
  – Blinding allows us to determine whether the treatment effect is significantly different from the placebo effect.
  – The polio experiment was double-blind.

Experiment
• Confounding occurs in an experiment when you are not able to distinguish among the effects of different factors.

Figure 1-4 Controlling Effects of a Treatment Variable
(a) Bad experimental design: treat all women subjects and give all men a placebo. (Problem: we don’t know whether effects are due to sex or to the treatment.)
(b) Completely randomized experimental design: use randomness to determine who gets the treatment.
(c) Randomized block design: 1. Form a block of women and a block of men. 2. Within each block, randomly select subjects to be treated.

Confounding
• In Figure 1-4(a), confounding occurs when the treatment group of women shows strong positive results: we cannot determine whether the treatment or the sex of the subjects caused the positive results.
• The Salk vaccine experiment illustrates one method of avoiding this problem, the completely randomized experimental design.
• Four methods are used to control the effects of variables.

Controlling effects of variables
1. Completely randomized experimental design (Figure 1-4(b))
2. Randomized block design (Figure 1-4(c))
3. Rigorously controlled design
4. Matched pairs design
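The first two designs in the list differ only in where the randomization happens. The following is a minimal sketch using Python's standard `random` module; the 10 women and 10 men and the rule of treating half of the subjects are hypothetical choices made for illustration.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical subjects: 10 women ("W") and 10 men ("M").
subjects = [("W", i) for i in range(10)] + [("M", i) for i in range(10)]

# Completely randomized design: shuffle everyone and treat the first half.
shuffled = subjects[:]
rng.shuffle(shuffled)
treated_crd = set(shuffled[: len(shuffled) // 2])

# Randomized block design: randomize separately within each block (sex).
treated_rbd = set()
for sex in ("W", "M"):
    block = [s for s in subjects if s[0] == sex]
    treated_rbd.update(rng.sample(block, len(block) // 2))

women_treated_rbd = sum(1 for s in treated_rbd if s[0] == "W")
print(len(treated_crd), len(treated_rbd), women_treated_rbd)  # 10 10 5
```

In the completely randomized design the number of treated women can vary from run to run, while the block design guarantees that half of each block is treated, so sex cannot be confounded with the treatment.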


• Rigorously controlled design: carefully assign subjects to different treatment groups, so that those given each treatment are similar in the ways that are important to the experiment.
  – E.g. in an experiment testing the effectiveness of aspirin on heart disease, if the placebo group includes a 30-year-old overweight male smoker who drinks heavily and consumes an abundance of salt and fat, the treatment group should also include a person with similar characteristics.

• Matched pairs design: compare exactly two treatment groups (such as treatment and placebo) by using subjects matched in pairs that are somehow related or have similar characteristics.
  – E.g. a test of Crest toothpaste used matched pairs of twins, where one twin used Crest and the other used another toothpaste.

Summary
Three very important considerations in the design of experiments:
• Use randomization to assign subjects to different groups.
• Use replication by repeating the experiment on enough subjects so that the effects of treatments or other factors can be clearly seen.
• Control the effects of variables by using such techniques as blinding and a completely randomized experimental design.

Sampling errors
• A sampling error is the difference between a sample result and the true population result; such an error results from chance sample fluctuations.
• A nonsampling error occurs when the sample data are incorrectly collected, recorded, or analyzed (such as by selecting a biased sample, using a defective measuring instrument, or copying the data incorrectly).
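Sampling error can be seen directly in a small simulation. The sketch below builds a made-up population with a known proportion and draws several simple random samples from it; the population size, the 61% figure, the sample size of 500, and the seed are all assumptions chosen for illustration.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 voters, exactly 61% of whom favor candidate A.
population = [1] * 6100 + [0] * 3900
true_proportion = sum(population) / len(population)

# Sampling error: the difference between a sample result and the true
# population result, here for five simple random samples of 500 voters.
errors = []
for _ in range(5):
    sample = rng.sample(population, 500)
    sample_proportion = sum(sample) / len(sample)
    errors.append(sample_proportion - true_proportion)

print(true_proportion)                  # 0.61
print([round(e, 3) for e in errors])    # small differences that vary by sample
```

Every sample here is collected correctly, so the differences are pure sampling error. A nonsampling error, such as drawing only from voters who favor one candidate, would not shrink no matter how carefully the arithmetic is done.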
