INTRODUCTION TO SURVEY SAMPLING

February 14, 2018
Linda Owens
www.srl.uic.edu

General information

• Please hold questions until the end of the presentation
• Slides available at http://www.srl.uic.edu/seminars.htm
• Please raise your hand so that I can see that you can hear me

Outline

• Introduction
• Target Populations
• Sample Frames
• Sample Designs
• Determining Sample Sizes
• Modes of Data Collection
• Questions

Introduction

Census:
• Gathering information about every individual in a population

Sample:
• Selection of a small subset of a population

[Figure: Census vs. Sample]

Why sample instead of taking a census?

• Less expensive
• Less time-consuming
• More accurate
• Samples can lead to statistical inference about the entire population

Probability vs. non-probability

Probability Sample
• Generalize to the entire population
• Unbiased results
• Known, non-zero probability of selection

Non-probability Sample
• Exploratory research
• Convenience
• Probability of selection is unknown

Probability vs Non-Probability Sample

[Figure: Probability sample, p = n/N = 10/30 = .3333; Non-probability sample, p = n/? = ?]

• Steve Mays, YouTube video on sampling: https://youtu.be/yx5KZi5QArQ
• Rahul Patwari, YouTube video on non-probability sampling: https://youtu.be/-kwdXEXC7yE

Target population

Definition: The population to which we want to generalize our findings

• Unit of analysis: Individual/Household/City
• Geography: State of Illinois/Champaign County/City of Urbana
• Age/Gender
• Other variables

Examples of target populations

• Population of adults in Champaign County
• Faculty, staff, or students at the University of Illinois
• Youth age 5 to 18 in Champaign County
• Registered Voters

Sampling frame

• Before you can ask people to answer your questions, you have to make contact with them
• How will you do that?
• The sampling frame is the mechanism that makes that possible
• The information available on the sampling frame has a bearing on the mode of data collection

Sampling frame

• A complete list of all units, at the first stage of sampling, from which a sample is drawn
• For example, lists of . . .
  • addresses
  • landline phone numbers in specific area codes
  • blocks or census tracts in specified geographic areas
  • members of a professional organization
  • schools
  • cell phone numbers

Target populations, sample frames, and coverage

Example 1:
• Population: Adults in Champaign County, IL
• Frames: List of landline numbers, list of census blocks, list of addresses

Example 2:
• Population: Youth age 5 to 18 in Cook County
• Frame: List of schools

Example 3:
• Population: Adults age 18-34 in United States
• Frame: ??

Coverage: How well does the sample frame represent the target population?

Coverage Error

[Figure: Target Population and Sample Frame]
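One way to make coverage concrete is to treat the target population and the sample frame as sets and compare them. The sketch below is illustrative only: the names in both sets are made up, and real frames rarely allow such a direct comparison.

```python
# Hypothetical target population and sample frame (names are invented).
target_population = {"ann", "bob", "cat", "dan", "eve"}
sample_frame = {"bob", "cat", "dan", "fay"}

# Units that are in the target population AND reachable through the frame.
covered = target_population & sample_frame

# Undercoverage: target units missing from the frame (they can never be sampled).
undercoverage = target_population - sample_frame

# Ineligible units: listed on the frame but outside the target population.
ineligible = sample_frame - target_population

# Share of the target population that the frame covers.
coverage_rate = len(covered) / len(target_population)

print(sorted(undercoverage), sorted(ineligible), coverage_rate)
```

Coverage error arises when the units in `undercoverage` differ systematically from the covered units on the variables being measured.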

Sample designs for probability samples

• Simple random samples
• Systematic samples
• Stratified samples
• Cluster
• Multi-stage
• Combination (e.g. stratified cluster sample)

Simple random sampling (SRS)

• Definition: Every element has the same probability of selection and every combination of elements has the same probability of selection.
• Probability of selection: n/N, where n = sample size; N = population size
• Use random number tables or software packages to generate random numbers
• Most precision estimates assume SRS

[Figure: Simple Random Sample (6 out of 30)]
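A simple random sample of 6 out of 30 can be drawn with a software random number generator. This is a minimal sketch using Python's standard library; the function name `simple_random_sample`, the population of numbered elements, and the seed are assumptions made for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n elements without replacement.

    random.Random.sample gives every subset of size n the same chance
    of being selected, which matches the SRS definition.
    """
    rng = random.Random(seed)
    return rng.sample(population, n)

# Hypothetical population of 30 elements, numbered 1..30.
population = list(range(1, 31))
sample = simple_random_sample(population, 6, seed=2018)

# Probability of selection for each element: n/N = 6/30.
p_selection = len(sample) / len(population)
print(sorted(sample), p_selection)
```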

Systematic sampling

• Definition: Every element has the same probability of selection, but not every combination can be selected.
• Use when drawing SRS is difficult
  • List of elements is long & not computerized
• Procedure
  • Determine population size N and sample size n
  • Calculate sampling interval (N/n)
  • Pick random start between 1 & sampling interval
  • Take every i-th case, where i is the sampling interval
  • Problem of periodicity

[Figure: Systematic Sample (every 5th)]
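The systematic sampling procedure can be sketched in a few lines. This is an illustration, not a production routine: the function name `systematic_sample` is hypothetical, and the sketch assumes N is an exact multiple of n so that the sampling interval is a whole number.

```python
import random

def systematic_sample(population, n, seed=None):
    """Take every i-th element after a random start (1-based positions)."""
    N = len(population)
    interval = N // n                      # sampling interval N/n (assumes N % n == 0)
    rng = random.Random(seed)
    start = rng.randint(1, interval)       # random start between 1 and the interval
    positions = range(start, N + 1, interval)
    return [population[pos - 1] for pos in positions]

# Hypothetical list of 30 elements; n = 6 gives an interval of 5 (every 5th case).
population = list(range(1, 31))
sample = systematic_sample(population, 6, seed=7)
print(sample)
```

Only `interval` distinct samples are possible (one per starting position), which is why not every combination of elements can be selected. If the list has a cyclical pattern whose period matches the interval, the sample can be badly unrepresentative; that is the periodicity problem.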

Stratified sampling: Proportionate

• To ensure sample resembles some aspect of population
• Population is divided into subgroups (strata)
  • Students by year in school
  • Faculty by gender
• Simple Random Sample (with same probability of selection) taken from each stratum.
• Sampling fraction is the same for all strata, regardless of population in each stratum.
• Larger strata will have larger samples

[Figure: Proportionate Stratified Sample (sampling fraction = 1/5). Three strata: N=25 (n=5), N=10 (n=2), N=15 (n=3)]
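The allocation in the figure can be reproduced by applying the same sampling fraction to every stratum. In this sketch the stratum labels ("A", "B", "C"), the unit identifiers, and the function name are all hypothetical; only the stratum sizes (25, 10, 15) and the fraction 1/5 come from the slide.

```python
import random

def proportionate_stratified_sample(strata, fraction, seed=None):
    """Draw an SRS from each stratum using one common sampling fraction."""
    rng = random.Random(seed)
    sample = {}
    for name, units in strata.items():
        n_h = round(len(units) * fraction)   # stratum sample size
        sample[name] = rng.sample(units, n_h)
    return sample

# Hypothetical strata with the sizes shown in the figure.
strata = {
    "A": [f"A{i}" for i in range(25)],
    "B": [f"B{i}" for i in range(10)],
    "C": [f"C{i}" for i in range(15)],
}

sample = proportionate_stratified_sample(strata, 1 / 5, seed=1)
sizes = {name: len(units) for name, units in sample.items()}
print(sizes)  # {'A': 5, 'B': 2, 'C': 3}
```

Because every stratum uses the same fraction, each element has the same probability of selection (1/5) and no weighting is needed for the different strata.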

Stratified sampling: Disproportionate

• Major use is comparison of subgroups
• Population is divided into subgroups (strata)
  • Compare girls & boys who play Little League
  • Compare seniors & freshmen who live in dorms
• Probability of selection needs to be higher for smaller stratum (girls & seniors) to be able to compare subgroups.
• Requires weighting to adjust for different probabilities of selection

[Figure: Disproportionate Stratified Sample (n=12, 4 from each stratum, overall p=.24). Stratum probabilities: p=4/25=.16, p=4/10=.40, p=4/15=.267]
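The weighting step can be illustrated with the numbers from the figure. A common approach, assumed here, is to weight each sampled case by the inverse of its probability of selection; the stratum labels are hypothetical.

```python
# Stratum population sizes (N_h) and sample sizes (n_h) from the figure.
population_sizes = {"A": 25, "B": 10, "C": 15}
sample_sizes = {"A": 4, "B": 4, "C": 4}

# Probability of selection within each stratum: p_h = n_h / N_h.
probabilities = {h: sample_sizes[h] / population_sizes[h] for h in population_sizes}

# Base weight for each sampled case: w_h = 1 / p_h = N_h / n_h.
weights = {h: population_sizes[h] / sample_sizes[h] for h in population_sizes}

# The weights of all sampled cases should add up to the population size.
weighted_total = sum(weights[h] * sample_sizes[h] for h in population_sizes)

print(probabilities)
print(weights)          # {'A': 6.25, 'B': 2.5, 'C': 3.75}
print(weighted_total)   # 50.0
```

Each case from the largest stratum stands for 6.25 population members, while a case from the smallest stratum stands for only 2.5, which undoes the oversampling when estimating for the whole population.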

Cluster/Multistage sampling

• Typically used in face-to-face surveys
• Population divided into clusters
  • Schools (earlier example)
  • Blocks
• Draw a sample of clusters, then either
  • Include every member of cluster (=cluster sample), or
  • Select random sample of cluster members (=multistage sample)
• Reasons for cluster sampling
  • Reduction in cost
  • No satisfactory sampling frame available

[Figure: Cluster Sample]
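The difference between a cluster sample and a multistage sample can be sketched as two small functions. The school names, student identifiers, and function names below are hypothetical; the sketch assumes clusters are selected with equal probability.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Sample clusters, then include EVERY member of each sampled cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for c in chosen for member in clusters[c]]

def two_stage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Sample clusters, then draw a random sample of members within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [m for c in chosen for m in rng.sample(clusters[c], n_per_cluster)]

# Hypothetical frame: 5 schools with 20 students each.
schools = {
    f"school_{s}": [f"school_{s}_student_{i}" for i in range(20)]
    for s in range(5)
}

all_members = cluster_sample(schools, n_clusters=2, seed=3)
subsample = two_stage_sample(schools, n_clusters=2, n_per_cluster=5, seed=3)
print(len(all_members), len(subsample))  # 40 10
```

Only a list of schools is needed up front; student lists are required just for the sampled schools, which is why clustering helps when no satisfactory frame of individuals exists.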

Complex Sample Designs

• Combination of sample strategies
• Example: multistage, stratified sample of adults in Chicago
  1. Stratify census blocks into groups based on predominant racial/ethnic group
  2. Draw a sample of census blocks from each stratum
  3. Draw a sample of housing units from each sampled census block
  4. Sample one respondent from all eligible adults in the household
  5. Each sampling stage has its own probability of selection
  6. Final probability of selection of eligible adult is product of all stages
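The final step, multiplying the stage probabilities, is simple arithmetic. The counts below are invented for illustration (they are not from the Chicago example), and the sketch assumes equal-probability selection within each stage.

```python
from math import isclose, prod

# Hypothetical selection probabilities at each stage for one respondent.
stage_probabilities = {
    "census block within stratum": 10 / 200,   # 10 of 200 blocks sampled
    "housing unit within block": 5 / 50,       # 5 of 50 housing units sampled
    "adult within household": 1 / 2,           # 1 of 2 eligible adults sampled
}

# Final probability of selection is the product of all stage probabilities.
p_final = prod(stage_probabilities.values())

# The corresponding base weight is the inverse of that probability.
weight = 1 / p_final

print(p_final, weight)  # approximately 0.0025 and 400
```

Because household size and block size vary, `p_final` generally differs across respondents, so designs like this need weights in analysis.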

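Determining sample size: SRS

• Need to consider
  • Precision
  • Variation in subject of interest
• Formula

  Sample size: $n_0 = \frac{CI^2 \cdot (pq)}{\text{Precision}^2}$

  where CI is the z-value for the chosen confidence level (1.96 for 95%), p is the expected proportion, and q = 1 - p

• For example:

  $n_0 = \frac{1.96^2 \cdot (.5 \cdot .5)}{.05^2}$

• Sample size not dependent on population size (except finite population correction)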
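The example can be evaluated directly. This is a small sketch of the formula with p = .5, a 95% confidence level (z = 1.96), and precision of .05; the function and parameter names are my own.

```python
import math

def srs_sample_size(z, p, precision):
    """n0 = z^2 * p * q / precision^2, with q = 1 - p."""
    q = 1 - p
    return z ** 2 * p * q / precision ** 2

n0 = srs_sample_size(z=1.96, p=0.5, precision=0.05)
print(round(n0, 2))    # 384.16
print(math.ceil(n0))   # 385 completed cases, rounding up
```

Using p = .5 gives the largest possible value of pq, so it is the conservative choice when the true proportion is unknown.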

Sample size: Other issues

• Finite Population Correction (FPC)
  • Use when sample >5% of pop
  • $n = n' / (1 + \frac{n'}{N})$, where n' is the uncorrected sample size and N is the population size
• Design effects
• Analysis of subgroups
• Increase size to accommodate nonresponse
• Cost
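Two of these adjustments are easy to sketch. The population size (2,000) and the expected response rate (40%) below are hypothetical, and dividing by the expected response rate is one common way, assumed here, to inflate the sample for nonresponse.

```python
import math

def apply_fpc(n_prime, N):
    """Finite population correction: n = n' / (1 + n'/N)."""
    return n_prime / (1 + n_prime / N)

def inflate_for_nonresponse(n, response_rate):
    """Number of cases to sample so that about n respond (rounded up)."""
    return math.ceil(n / response_rate)

n_prime = 384.16   # uncorrected SRS sample size (p = .5, z = 1.96, precision = .05)
N = 2000           # hypothetical population size; n' is well over 5% of N

n = apply_fpc(n_prime, N)
n_to_sample = inflate_for_nonresponse(n, response_rate=0.40)

print(round(n, 2))   # 322.26
print(n_to_sample)   # 806
```

The correction lowers the required number of completes from about 384 to about 322, but the expected nonresponse more than doubles the number of cases that must be drawn.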

Modes of data collection

• Face to face
• Phone
• Web
• Mail

Target population/frame/mode correspondence

• Mode needs to be consistent with information in sample frame
• Mode needs to be consistent with target population

Cell phone and landline frames

• Increasing proportion of US households are cell phone only (52.5% in 2017, 5.9% landline only)
  • https://www.cdc.gov/nchs/data/nhis/earlyrelease/wireless201712.pdf (Blumberg & Luke)
• Cell phone only households tend to be
  • Unrelated adults
  • Hispanic adults
  • Younger
  • Lower SES
  • But…
• Landline sample frames can lead to bias

Cell phone and landline frames, cont.

• Cell phone frames are harder to target geographically than landline frames
• Survey researchers are combining landline and cell phone frames

Address-based sampling

• Sampling addresses from a near universal listing of residential mail delivery locations
• Post Office Delivery Sequence Files (DSF)

Address-based sampling: advantages

• Coverage of households is very high
• Can be matched to name and listed telephone numbers
• Includes non-telephone households
• More efficient than traditional block-listing

Address-based sampling: disadvantages

• Incomplete in rural areas (although improving with 9-1-1 address conversion)
• Difficulties with “multidrop” addresses
• Best used with mail or face to face surveys.
• Can be used for web surveys with some additional effort/cost

Thank you!

Future noontime webinars
• Introduction to Questionnaire Design, Wednesday, February 21
• Survey Response Rates: Uses and Misuses, Wednesday, February 28

Evaluation

Questions

Resources

• Books on Sampling: the Classics
  • Leslie Kish, Survey Sampling, 1965
  • William Cochran, Sampling Techniques, 3rd Ed. 1977
  • Seymour Sudman, Applied Sampling, 1976
• Sharon Lohr, Sampling: Design and Analysis, 2009
• https://www.cdc.gov/nchs/nhis/releases.htm#wireless
• Rahul Patwari, YouTube video on non-probability sampling: https://youtu.be/-kwdXEXC7yE
• Steve Mays, YouTube video on sampling: https://youtu.be/yx5KZi5QArQ