Training of Trainers (ToT) in Interdisciplinary Field Research Methodology (IFRM)/SaciWATERs

Module 5 – Quantitative Survey

Lecture notes prepared by Dr. Shahjahan Mondal for the Training of Trainers on Interdisciplinary Field Research Methodology, co-organized by IWFM, BUET and SaciWATERs and held at Cox’s Bazar, Bangladesh, from 5-15 December 2010.

I- Statistical Sampling

Introduction

The basic idea in sampling is extrapolation from the part to the whole: from "the sample" to "the population." (The population is sometimes rather mysteriously called "the universe.") There is an immediate corollary: the sample must be chosen to fairly represent the population. Methods for choosing samples are called "designs." Good designs involve the use of probability methods, minimizing subjective judgment in the choice of units to survey. Samples drawn using probability methods are called "probability samples."

Bias is a serious problem in applied work; probability samples minimize bias. As it turns out, however, methods used to extrapolate from a probability sample to the population should take into account the method used to draw the sample; otherwise, bias may come in through the back door. The ideas will be illustrated for sampling people or business records, but apply more broadly. There are sample surveys of buildings, farms, law cases, schools, trees, trade union locals, and many other populations. (Refer reading I)

Types of Sampling

There are two different types of sampling: i) Probability Sampling and ii) Non-probability Sampling.

i) Probability Sampling (Reading II, Word file, and Reading III, PDF file)

With probability sampling, all elements (e.g., persons, households) in the population have some opportunity of being included in the sample, and the mathematical probability that any one of them will be selected can be calculated.

Probably the most familiar type of probability sample is the simple random sample, for which all elements in the sampling frame have an equal chance of selection, and sampling is done in a single stage with each element selected independently. Somewhat more common than simple random samples are systematic samples, which are drawn by starting at a randomly selected element in the sampling frame and then taking every nth element (e.g., starting at a random location in a telephone book and then taking every 100th name). In yet another approach, cluster sampling, a researcher selects the sample in stages, first selecting groups of elements, or clusters (e.g., city blocks, census tracts, schools), and then selecting individual elements from each cluster (e.g., randomly or by systematic sampling).

Example

Suppose some researchers want to find out which of two mayoral candidates is favored by voters. Obtaining a probability sample would involve defining the target population (in this case, all eligible voters) and using one of many available procedures for selecting a relatively small number (probably fewer than 1,000) of those people for interviewing. For example, the researchers might create a systematic sample by obtaining the voter registration roster, starting at a randomly selected name, and contacting every 500th person thereafter. Or, in a more sophisticated procedure, the researchers might use a computer to randomly select telephone numbers from all of those in use in the city, and then interview a registered voter at each telephone number. (This procedure would yield a sample that represents only those people who have a telephone.)

Several procedures would also be available for recruiting a convenience sample, but none of them would include the entire population as potential respondents. For example, the researchers might ascertain the voting preferences of their own friends and acquaintances. Or they might interview shoppers at a local mall. Or they might publish two telephone numbers in the local newspaper and ask readers to call either number in order to "vote" for one of the candidates. The important feature of these methods is that they would systematically exclude some members of the population (respectively, eligible voters who do not know the researchers, do not go to the shopping mall, and do not read the newspaper). Consequently, their findings could not be generalized to the population of city voters.

ii) Non-probability Sampling

With non-probability sampling, in contrast, population elements are selected on the basis of their availability (e.g., because they volunteered) or because of the researcher's personal judgment that they are representative. The consequence is that an unknown portion of the population is excluded (e.g., those who did not volunteer). One of the most common types of non-probability sample is called a convenience sample, not because such samples are necessarily easy to recruit, but because the researcher uses whatever individuals are available rather than selecting from the entire population.

There are three different types of non-probability sampling:

• Accidental sampling: information is collected from anyone in sight on a road, in a market or at a bus station
• Judgement/purposive sampling: e.g., a study of housing satisfaction where a company allots land to its employees
• Quota sampling: e.g., fertility level by place of residence; the selection of respondents is left to the choice of the interviewers

Sample Size

Sample size plays a vital role in determining how well the sample represents the population. It depends on the following:

• The maximum allowable (acceptable) error, i.e., the accuracy that the researcher requires. When the sample is too small to be representative, the chance of error is higher.
• The confidence level (95% or 99%) that the researcher needs. As the confidence level increases, the required sample size increases.
• The population variance, which can be estimated in the following ways:
o Two stages of sampling and their variances
o Pilot survey
o Literature
o Educated guess

Example of sample size determination

• Average number of children per family in a town of 4000 families.
• We want our estimate to be accurate within 0.3 children (i.e., the width of the confidence interval is 0.6).
• The best available estimate of σ is 3.
• $Z_{1-0.05/2} = Z_{0.975} = 1.96$ (from a table)
• Estimated sample size is 384.
• After finite population correction, it is 350.

The statistical formulas for estimating the confidence interval, margin of error and sample size are given below.

Confidence interval of the mean: $\bar{X} \pm Z_{1-\alpha/2} \dfrac{\sigma}{\sqrt{n}}$

Margin of error: $e = Z_{1-\alpha/2} \dfrac{\sigma}{\sqrt{n}}$

Sample size: $n = \left( Z_{1-\alpha/2} \dfrac{\sigma}{e} \right)^2$

Finite population correction: multiply $n$ by $\dfrac{N}{N + n}$

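The worked example above can be checked with a short Python sketch. The function names are illustrative and not part of the lecture notes, and the correction factor N/(N + n) is the one that reproduces the figure of 350 given in the example. The results are rounded to the nearest whole number to match the notes; in practice many surveyors round up instead.

```python
def sample_size(z, sigma, e):
    """Sample size for estimating a mean: n = (z * sigma / e) ** 2."""
    return (z * sigma / e) ** 2


def finite_population_correction(n, N):
    """Adjust n for a finite population of size N: n * N / (N + n)."""
    return n * N / (N + n)


# Values taken from the example: 4000 families, sigma = 3, e = 0.3, 95% confidence.
z = 1.96
sigma = 3
e = 0.3
N = 4000

n = sample_size(z, sigma, e)                 # about 384.2
n_adj = finite_population_correction(n, N)   # about 350.5

print(round(n), round(n_adj))  # prints: 384 350
```

Note that halving the margin of error e quadruples the uncorrected sample size, because e enters the formula squared.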
The sample size formula above (before the finite population correction) assumes that the population is infinite, or so large that the sampling fraction is negligible.

Practical Problems

Every research project investigates a number of variables simultaneously, and these variables differ in their variability. A variable with greater variability requires a larger sample to achieve a given level of precision than a variable with smaller variability. Using the largest of the resulting sample sizes may be too costly. In that case, choose the sample size based on the variable for which the greatest precision is required. For stratified sampling, if an estimate of the variability is available within each stratum, the number of elements required in each stratum, and hence the total sample size, can be determined. For a cluster sample, if estimates of the variances between and within primary sampling units (PSUs) are available, the size of the required sample can be estimated.

Sample Design

Probability samples should be distinguished from "samples of convenience" (also called "grab samples"). A typical sample of convenience comprises the investigator's students in an introductory course. A "mall sample" consists of the people willing to be interviewed on certain days at certain shopping centers. This too is a convenience sample. The reason for the nomenclature is apparent, and so is the downside: the sample may not represent any definable population larger than itself.

To draw a probability sample, we begin by identifying the population of interest. The next step is to create the "sampling frame," a list of units to be sampled. One easy design is "simple random sampling." For instance, to draw a simple random sample of 100 units, choose one unit at random from the frame; put this unit into the sample; choose another unit at random from the remaining ones in the frame; and so forth. Keep going until 100 units have been chosen. At each step along the way, all units in the pool have the same chance of being chosen.

Simple random sampling is often practical for a population of business records, even when that population is large. When it comes to people, especially when face-to-face interviews are to be conducted, simple random sampling is seldom feasible: where would we get the frame? More complex designs are therefore needed. If, for instance, we wanted to sample people in a city, we could list all the blocks in the city to create the frame, draw a simple random sample of blocks, and interview all people in housing units in the selected blocks. This is a "cluster sample," the cluster being the block. Notice that the population has to be defined rather carefully: it consists of the people living in housing units in the city, at the time the sample is taken.

There are many variations. For example, one person in each household can be interviewed to get information on the whole household. Or, a person can be chosen at random within the household. The age of the respondent can be restricted; and so forth. If telephone interviews are to be conducted, "random digit dialing" often provides a reasonable approximation to simple random sampling, for the population with telephones.
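The designs described in these notes can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not a field-ready procedure: the frame, the block names and the function names are all hypothetical. The systematic sampler uses one common variant, starting at a random position within the first interval.

```python
import random


def simple_random_sample(frame, n, rng):
    """Draw n units without replacement; each remaining unit is equally likely at every step."""
    return rng.sample(frame, n)


def systematic_sample(frame, step, rng):
    """Pick a random start within the first `step` units, then take every `step`-th unit."""
    start = rng.randrange(step)
    return frame[start::step]


def cluster_sample(clusters, n_clusters, rng):
    """Draw a simple random sample of clusters and keep every unit in each chosen cluster."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}


rng = random.Random(2010)  # fixed seed so the draw can be repeated

# Hypothetical sampling frame of 1000 records.
frame = [f"unit-{i:04d}" for i in range(1000)]
srs = simple_random_sample(frame, 100, rng)
every_100th = systematic_sample(frame, 100, rng)

# Hypothetical city of 20 blocks with 5 residents each.
blocks = {f"block-{b:02d}": [f"person-{b:02d}-{p}" for p in range(5)] for b in range(20)}
selected = cluster_sample(blocks, 4, rng)

print(len(srs), len(every_100th), sum(len(people) for people in selected.values()))
# prints: 100 10 20
```

In the cluster design only the blocks are chosen at random; everyone in a chosen block is interviewed, which is why it needs a list of blocks rather than a list of people.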

Classification of Errors

Since the sample is only part of the whole, extrapolation inevitably leads to errors. These are of two kinds: sampling error ("random error") and non-sampling error ("systematic error"). The latter is often called "bias," without connoting any prejudice.

Sampling error results from the luck of the draw when choosing a sample: we get a few too many units of one kind, and not enough of another. The likely impact of sampling error is usually quantified using the "SE," or standard error. With probability samples, the SE can be estimated using (i) the sample design and (ii) the sample data. As the "sample size" (the number of units in the sample) increases, the SE goes down, albeit rather slowly. If the population is relatively homogeneous, the SE will be small: the degree of heterogeneity can usually be estimated from sample data, using the standard deviation or some analogous statistic. Cluster samples, especially with large clusters, tend to have large SEs, although such designs are often cost-effective.

Non-sampling error is often the more serious problem in practical work, but it is harder to quantify and receives less attention than sampling error. Non-sampling error cannot be controlled by making the sample bigger. Indeed, bigger samples are harder to manage. Increasing the size of the sample, which is beneficial from the perspective of sampling error, may be counterproductive from the perspective of non-sampling error.
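The remark that the SE falls "rather slowly" as the sample grows can be made concrete. The sketch below assumes a simple random sample from a large population, for which the SE of a mean is σ/√n; the value σ = 3 is borrowed from the sample size example and the sample sizes are arbitrary. Other designs, such as cluster samples, need different formulas.

```python
import math


def standard_error(sigma, n):
    """SE of a sample mean under simple random sampling from a large population."""
    return sigma / math.sqrt(n)


sigma = 3.0
for n in (100, 400, 1600):
    print(n, round(standard_error(sigma, n), 3))
# prints:
# 100 0.3
# 400 0.15
# 1600 0.075
```

Each fourfold increase in the sample size only halves the SE.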

Non-sampling error itself can be broken down into three main categories: (i) selection bias, (ii) non-response bias, and (iii) response bias. We discuss these in turn.

(i) "Selection bias" is a systematic tendency to exclude one kind of unit or another from the sample. With a convenience sample, selection bias is a major issue. With a well-designed probability sample, selection bias is minimal. That is the chief advantage of probability samples.

(ii) Generally, the people who hang up on you are different from the ones who are willing to be interviewed. This difference exemplifies non-response bias. Extrapolation from respondents to non-respondents is problematic, due to non-response bias. If the response rate is high (most interviews are completed), non-response bias is minimal. If the response rate is low, non-response bias is a problem that needs to be considered. At the time of writing, U.S. government surveys that accept any respondent in the household have response rates over 95%. The best face-to-face research surveys in the U.S., interviewing a randomly-selected adult in a household, get response rates over 80%. The best telephone surveys get response rates approaching 60%. Many commercial surveys have much lower response rates, which is cause for concern.

(iii) Respondents can easily be led to shade the truth by interviewer attitudes, the precise wording of questions, or even the juxtaposition of one question with another. These are typical sources of response bias.

Sampling error is well-defined for probability samples. Can the concept be stretched to cover convenience samples? That is debatable (see below). Probability samples are expensive, but they minimize selection bias and provide a basis for estimating the likely impact of sampling error. Response bias and non-response bias affect probability samples as well as convenience samples.

Reading Materials

1. Doherty, M. (1994) Probability versus Non-Probability Sampling in Sample Surveys, The New Zealand Statistics Review, March 1994 issue, pp 21-28.
2. Greg Kochanski, Statistical Sampling, http://kochanski.org/gpk/teaching/0401Oxford
3. Stratified statistical sampling, Florida Department of Revenue Training Manual.

II- Questionnaire Survey

What is a Questionnaire?

A questionnaire is a research instrument consisting of a series of questions and other prompts for the purpose of gathering information from respondents. Although questionnaires are often designed for statistical analysis of the responses, this is not always the case. The questionnaire was invented by Sir Francis Galton.

Questionnaires have advantages over some other types of surveys in that they are cheap, do not require as much effort from the questioner as verbal or telephone surveys, and often have standardized answers that make it simple to compile data. However, such standardized answers may frustrate users. Questionnaires are also sharply limited by the fact that respondents must be able to read the questions and respond to them. Thus, for some demographic groups, conducting a survey by questionnaire may not be practical.

In field research, the researcher collects information by observing the social phenomena directly and as completely as possible. But where the population is too large to observe directly, survey research is probably the best option for collecting primary data. It is one of the most powerful and frequently used methods in the social sciences. A set of questions is used. The method is suitable for obtaining information for explaining social phenomena and for exploratory research.

A method is different from a tool. While a method refers to the way or mode of gathering data, a tool is an instrument used for the method. For example, a schedule is used for interviewing.

A questionnaire may contain a mixture of questions and statements (some items are better served as statements and some as questions). Consider the following examples.

a) Do you believe that the death penalty is ever justified?
- Yes
- No

b) The death penalty is justifiable under some circumstances.
1 Strongly disagree   2 Disagree   3 Neutral   4 Agree   5 Strongly agree

A distinction can be made between questionnaires with questions that measure separate variables and questionnaires with questions that are aggregated into either a scale or index. Questionnaires within the former category are commonly part of surveys, whereas questionnaires in the latter category are commonly part of tests. Questionnaires with questions that measure separate variables could include questions on:

• preferences (e.g. political party)
• behaviours (e.g. food consumption)
• facts (e.g. gender)

Questionnaires with questions that are aggregated into either a scale or index include questions that measure:
• latent traits (e.g. personality traits such as extroversion)
• attitudes (e.g. towards immigration)
• an index (e.g. Social Economic Status)
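As a minimal sketch of how items are aggregated into a scale, the Python example below codes the five response labels used in these notes as 1 to 5 and averages them for one respondent. The item names and answers are hypothetical, and a simple mean is only one of several possible scoring rules.

```python
# Numeric codes for the five-point scale used in the examples in these notes.
LIKERT_CODES = {
    "Strongly disagree": 1,
    "Disagree": 2,
    "Neutral": 3,
    "Agree": 4,
    "Strongly agree": 5,
}


def scale_score(responses):
    """Mean of the coded answers of one respondent across all items of the scale."""
    codes = [LIKERT_CODES[answer] for answer in responses.values()]
    return sum(codes) / len(codes)


# Hypothetical answers of one respondent to three attitude items.
respondent = {"item_1": "Agree", "item_2": "Strongly agree", "item_3": "Disagree"}
score = scale_score(respondent)
print(round(score, 2))  # prints: 3.67
```

A questionnaire that measures separate variables would instead report each item on its own, without combining them.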

A food frequency questionnaire (FFQ) is a questionnaire used to assess the type of diet consumed by people, and may be used as a research instrument. Examples of its use include the assessment of intake of vitamins or toxins.

Basic rules for questionnaire construction

• Use statements that are interpreted in the same way by members of different sub-populations of the population of interest.
• Consider having an "open" answer category after a list of possible answers.
• Address only one aspect of the construct you are interested in per item.
• Use positive statements and avoid negatives or double negatives.
• Do not make assumptions about the respondent.
• Use clear and comprehensible wording that is easily understandable at all educational levels.
• Use correct spelling, grammar and punctuation.
• Avoid items that contain more than one question (e.g. Do you like strawberries and potatoes?).

Questionnaire Survey

Types of questionnaires

Structured questionnaire
• Identical questions for everyone in terms of wording and sequence
• Validity of results

Unstructured questionnaire
• Not the same set of questions (wording and sequence) for everybody
• Heterogeneous respondents
• Education, vocabularies, level of comprehension

Types of questions
- Questions should be based on facts, opinion, motivation and knowledge
- Questions should be short, clear, easy to understand and answerable without difficulties

Closed-ended and Open-ended questions

Closed-ended - A closed-ended question has the respondent pick an answer from a given number of options. The response options for a closed-ended question should be exhaustive and mutually exclusive. There is little or no room for respondents to volunteer additional information.

E.g.: The death penalty is justifiable under some circumstances.
1 Strongly disagree   2 Disagree   3 Neutral   4 Agree   5 Strongly agree
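For closed-ended questions with numeric answer categories, the requirement that options be exhaustive and mutually exclusive can be checked mechanically before the questionnaire goes to the field. The Python sketch below is an illustration only: the question (months of irrigation per year, 0 to 12) and both option lists are hypothetical.

```python
def check_options(options, low, high):
    """Report gaps and overlaps among integer ranges meant to cover low..high inclusive.

    `options` is a list of (label, start, end) tuples with inclusive bounds.
    An empty result means the options are exhaustive and mutually exclusive.
    """
    problems = []
    expected = low  # smallest value not yet covered by any option
    for label, start, end in sorted(options, key=lambda option: option[1]):
        if start > expected:
            problems.append(f"gap: {expected}-{start - 1} is not covered")
        elif start < expected:
            problems.append(f"overlap: option '{label}' repeats values below {expected}")
        expected = max(expected, end + 1)
    if expected <= high:
        problems.append(f"gap: {expected}-{high} is not covered")
    return problems


# Hypothetical question: "For how many months per year do you irrigate?" (0 to 12)
good = [("0-3", 0, 3), ("4-6", 4, 6), ("7-9", 7, 9), ("10-12", 10, 12)]
bad = [("0-3", 0, 3), ("3-6", 3, 6), ("8-12", 8, 12)]

print(check_options(good, 0, 12))  # prints: []
for problem in check_options(bad, 0, 12):
    print(problem)
```

In the second list a respondent who irrigates for 3 months fits two options, and one who irrigates for 7 months fits none.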

Open-ended - An open-ended question asks the respondent to formulate his or her own answer, for instance by completing a sentence. Respondents have the freedom to provide their own answers.

E.g.: What do you think of the performance of the water users' association?

In general, if the nature, range and diversity of responses are known, closed-ended questions are appropriate; otherwise, open-ended questions may be appropriate.

Characteristics of a good questionnaire
• Accurate communication
• Accurate response
• Attractive form and style
• Direct, simple and unambiguous questions
• The questions should not be too intimate
• The questions must be directly related to the specialized problem being explored
• The responses should be classifiable and amenable to statistical treatment

Construction of questionnaire

Step-1: Decide on the variables directly related to the hypothesis (or research questions), plus other variables (e.g. background variables)
Step-2: Frame questions to obtain information on the variables
• Cover letter (in case of a self-administered questionnaire)
• Introductory statements (in case of an interview)
• Confidentiality of the responses
• Instructions on how to fill in the questionnaire

Questionnaire length
• Should cover only those questions that are absolutely necessary (otherwise the quality of information may suffer and the cost may increase)
• Spread questions over a large number of pages (but not too lengthy)
• Try to avoid contingency questions

Question content
• As practicable (understandable) as possible
• Make respondents feel that the questions are relevant to them
• Avoid questions involving an event that occurred long, long ago (e.g. global warming)

Question wording
• Use language that is as simple as possible (it may be 'technical' in nature for certain professional groups)
• Make questions as specific as possible
  How much money did you spend for irrigation in the last three months?
  How much money do you generally spend for irrigation?
• Make questions unambiguous (specify the frame of reference)
  a) What irrigation method do you use?
  b) What irrigation method do you use?   LLP   STW   DSSTW   DTW

• Avoid double-barreled questions
  Do you find river and underground water sufficient for irrigation?
• Avoid 'leading' or 'biased' questions
  Negatively worded, fewer categories, loaded questions, tag questions, presumptuous
• Avoid questions with loaded words
  Don't you agree that this ToT on IFRM is very useful?
• Avoid presuming questions
  What is your role in decision making about land management?
• Avoid vague words
  Do you use bottled water regularly or occasionally?
  (kind of, fairly, generally, often, many, much the same, on the whole, etc. are vague words)
  Vagueness also occurs with 'why' questions
• Be careful about asking private and embarrassing questions
  How much fertilizer did you use for cultivation of rice in the last season?
  What varieties of rice did you cultivate in the last season?
  What crops are sensitive to climate change?

Question sequence
• Questions should flow logically from one to the next.
• Put easy-to-answer, interesting questions at the beginning; move from the least sensitive to the most sensitive, with ego-threatening and dull questions at the end.
• Move from the factual and behavioural to the attitudinal, and from the more general to the more specific.
• Make the respondents feel that the whole process is a meaningful exercise.
• Start with general questions about the subject matter, then narrow down to specific issues; each successive question is related to the previous question but in a more specific way.

Pretesting

Before conducting the questionnaire survey, check the following:
• Questions are meaningful and responsive
• Ambiguous and irrelevant questions are identified
• Questionnaire format is suitable
• Respondents have understood the contexts
• Respondents' eagerness to participate in the survey
• Interview time and attention of respondents (30 min)
• E.g., pre-tested with 25 local people in 5 administrative units
• Sometimes, a part of the survey (e.g. pilot survey)

Administration of Questionnaire
- Self-administration: mailed, emailed (computerized questionnaire administration, where the items are presented on the computer) or delivered in person
- Interview
  - Telephone
  - Face-to-face, where an interviewer presents the items orally
- Questionnaire and interview schedule (paper-and-pencil questionnaire administration, where the items are presented on paper)
  - Who asks and fills in

Adaptive computerized questionnaire administration is one where a selection of items is presented on the computer and, based on the answers to those items, the computer selects the following items, optimized for the respondent's estimated ability or trait.

Interviewing
• Start with standard initial greetings and explain the purpose of the survey
• Next, ask warm-up and general questions

Further reading

• Foddy, W. H. (1994). Constructing questions for interviews and questionnaires: Theory and practice in social research (New ed.). Cambridge, UK: Cambridge University Press.
• Gillham, B. (2008). Developing a questionnaire (2nd ed.). London, UK: Continuum International Publishing Group Ltd.
• Leung, W. C. (2001). How to conduct a survey. Student BMJ, 9, 143-5.
• Mellenbergh, G. J. (2008). Chapter 10: Tests and questionnaires: Construction and administration. In H. J. Adèr & G. J. Mellenbergh (Eds.) (with contributions by D. J. Hand), Advising on research methods: A consultant's companion (pp. 211–234). Huizen, The Netherlands: Johannes van Kessel Publishing.
• Mellenbergh, G. J. (2008). Chapter 11: Tests and questionnaires: Analysis. In H. J. Adèr & G. J. Mellenbergh (Eds.) (with contributions by D. J. Hand), Advising on research methods: A consultant's companion (pp. 235–268). Huizen, The Netherlands: Johannes van Kessel Publishing.
• Munn, P., & Drever, E. (2004). Using questionnaires in small-scale research: A beginner's guide. Glasgow, Scotland: Scottish Council for Research in Education.
• Oppenheim, A. N. (2000). Questionnaire design, interviewing and attitude measurement (New ed.). London, UK: Continuum International Publishing Group Ltd.
• Kothari, C. R. Research methodology – Methods and techniques. New Age International Publishers (P) Ltd., New Delhi.
• Young, P. V. (1956). Scientific Social Surveys. Prentice Hall.