CHAPTER 7

How to Measure Attitudes, Behaviour, and Traits

"Whatever exists, exists in some quantity; whatever exists in quantity can be measured." – E.L. Thorndike

After completing this chapter, you should be able to:

1. Understand the purpose of screening and warm-up questions and write screeners.
2. Understand how consumer behaviour is measured in marketing research.
3. Develop questions that measure behaviour.
4. Understand what attitudes are and how they are related to consumer behaviour.
5. Understand the different types of scales used in attitude measurement.
6. Understand the conditions under which different scales are used.
7. Select and use an appropriate scale to measure attitudes in a given context.
8. Understand the meaning of lifestyle and psychographic measurements.
9. Create statements to measure lifestyles and psychographics.
10. Measure demographics for marketing purposes.

In the last chapter, we saw how to design a questionnaire in general. However, there is more to writing a good questionnaire than knowing the general principles. We also need to know how to ask specific questions that relate to consumer attitudes, behaviour, and traits. Different information requirements may call for different types of questions. Some common areas of concern are:

● Screening and Warm-Up. What types of questions are suitable at the beginning of a questionnaire? Should we ask the difficult questions or the easy ones first? Does it ever make sense to ask screening questions later in the questionnaire?
● Behaviour. How do we measure behaviour? Suppose we want to find out how many movies an adult Canadian views during a typical month. Do we ask, (a) "How often do you go to the movies?" (b) "How many movies did you view last month?" or (c) "How many movies do you typically see in a month?"
● Attitudes. What exactly are attitudes and how do we measure them? Do we use a dichotomous question or do we use a scale? If it is a scale, what type of scale?
● Lifestyle. When we want to understand a consumer's lifestyle or psychographics, what type of questions should we ask? How do lifestyle questions differ from attitudinal questions?
● Demographic traits. Where do we place demographic questions in the questionnaire? How do we ask them? Suppose we want to ask about a person's household income, do we simply ask, "What is your household income?"

Obviously, these concerns are not trivial. They involve important measurement issues, and we touched on some of them in the previous chapter. In this chapter, however, we will cover these measurement issues in greater detail.

EXHIBIT 7-1

How to Capture Information: a screener leads into questions measuring demographics, psychographics/lifestyle, attitudes, and behaviour.

How to Write Screeners and Warm-Up Questions

Strictly speaking, screening questions do not "measure" anything. However, measurement of what follows the screener can be seriously affected if the screening questions are incorrect, biased, or misleading. Therefore, it is important to pay special attention to screeners and warm-up questions when we are about to measure anything of significance.

Screening questions identify whether the person we contacted is eligible to answer the questions at all. For instance if, in a telephone survey, we want to interview adults who bought Campbell's soup in the past month, we need to establish the following points:

1. The person answering the phone is an adult; and
2. The person answering the phone has bought Campbell's soup in the past month.

While this seems fairly straightforward—and most of the time it is—sometimes it may pose special challenges. Let us consider the first point. It contains two measurement issues.

The first issue is what we mean by an "adult." Before we can establish whether a person is an "adult," we need to define who an "adult" is. An adult can be a person who is 18 years of age, 17 years of age, or 21 years of age, depending on the purpose of the study. (For instance, in Ontario, the legal drinking age is 19, but the legal voting age is 18.) This we cannot make up as we go along, but should state explicitly. Let us assume that, in this study, we define an adult as someone who is at least 18 years of age.

The second issue is how to ask this question. We can simply ask the respondent what his or her age is, but this is not a good idea because age is personal information and the interviewer has not established any rapport with the respondent at this stage. Asking personal questions at this point may invite the respondent to refuse to answer. A question like, "I need to interview people in your area who are 18 years of age or older. Could you please tell me if you are 18 years of age or older?" is likely to elicit greater co-operation than a straightforward question on age for two reasons: first, the interviewer tells the respondent the reason for asking the question, and second, the question does not ask the respondent her or his specific age.

Sometimes even this approach can be problematic.


Suppose you want to interview people in households where the household income is $50 000 or higher, and you phrase the question as follows:

I need to interview people in your area whose household income is $50 000 or higher. Could you please tell me if your household income is $50 000 or higher?

There might be a general cultural presumption that a higher income is socially more desirable than a lower income. Therefore, those people with household incomes less than $50 000 may either feel uncomfortable or, in some cases, may even inflate their income. A better way to ask the question might be to make up categories below $50 000. The question can be phrased as follows:

What is your household income? Would it be . . . ?
Under $25 000        ______
$25 000 to $49 999   ______
$50 000 and higher   ______

This question is more likely to elicit better information because those who earn between $25 000 and $49 999 have an income category below them. Therefore, even if some who belong to the lowest income category claim to belong to the next higher income category, they still will not qualify according to the screening criterion.

The screening question is designed to identify the relevant person to be interviewed. Thus, we should minimize the chances of including a respondent who is not relevant. As another example, suppose we want to interview only those who visited a bookstore last month. If we simply ask, "Have you visited a bookstore in the past 30 days?" some people might be tempted to answer yes for various reasons: to please the interviewer by answering the question positively, or to provide an answer that is likely to be socially more acceptable. We can counter this tendency in two ways. (1) We can make the negative part of the answer more acceptable by including it in the question, as mentioned in the previous chapter: "Did you or did you not visit a bookstore in the past 30 days?"¹ (2) We can make what we are looking for less obvious by including other alternatives that are likely to be answered in a positive way.

Have you done any of the following in the past 30 days?
Watched TV             ______
Gone to a cinema       ______
Gone walking           ______
Visited a bookstore    ______
Visited a restaurant   ______
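To make the screening logic concrete, here is a minimal sketch of how a disguised screener like the one above might be scored in a computer-assisted (CATI or online) survey script. The function and variable names are illustrative assumptions, not part of any particular survey package:

```python
# Minimal sketch (assumed names): qualifying respondents with a
# disguised screener. Only one activity in the list actually matters.

ACTIVITIES = [
    "Watched TV",
    "Gone to a cinema",
    "Gone walking",
    "Visited a bookstore",   # the activity we actually screen on
    "Visited a restaurant",
]

QUALIFYING_ACTIVITY = "Visited a bookstore"

def qualifies(checked_activities: set[str]) -> bool:
    """Return True if the respondent passes the screener.

    The respondent sees the full activity list, so the purpose of the
    study is not obvious; we only test for the one relevant item.
    """
    return QUALIFYING_ACTIVITY in checked_activities

# Example: this respondent checked three activities, including the one
# we care about, so the interview would continue.
respondent = {"Watched TV", "Visited a bookstore", "Gone walking"}
print(qualifies(respondent))  # True
```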

Is there ever a case for placing the screening question at the end of a questionnaire? The answer is yes, although it is not very common. There are situations in which respondents' knowing or even guessing the purpose of the study may affect their responses. For example, we may want to ask consumers whether they have seen an ad for Bacardi. We may be interested in what they thought of Bacardi only if they have seen the ad. However, asking them if they have seen the ad may remind them of the ad's content and might, conceivably, influence their responses. In such cases, respondents are not screened but interviewed as if everyone qualified. Screening questions are asked at the end, and questionnaires completed by respondents who do not qualify are discarded.

¹ To keep our examples clear and uncluttered, we discuss one principle at a time. The question itself should not be considered complete in all respects. For instance, in actual questionnaires, most questions will include a "do not know" response, computer coding, etc. You may also want to consider making the negative option explicit in some questions (e.g., "Have you or have you not visited a library in the past 30 days?").


Sometimes, especially when the study deals with a sensitive subject or a complex topic, you may want to include a few "warm-up" questions. Warm-up questions are designed to establish rapport between the interviewer and the respondent. They should be short, easy to answer, and not of a personal nature. If long, difficult to answer, or sensitive questions are placed at the beginning of a questionnaire, they make it difficult to establish rapport quickly. Consider the following questions.

Let us assume that the government deficit is so high that the only sensible way out is through taxation. When you take this into account in conjunction with the restrictive monetary policy, are you in favour of increased taxes or not?

For some respondents, this question may be difficult to answer because it contains words and phrases such as "restrictive monetary policy" and "government deficit" that are not commonly used.

In the past week, by that I mean Monday to Friday only, how many issues of The Star did you personally happen to read or look into, either in your home or elsewhere?

Because this question is long and requires the respondent to remember a series of phrases such as "Monday to Friday," "personally happen to," "read or look into," and "in your home or elsewhere," the respondent may feel frustrated.

Would you say that you have more marital problems now compared to this time last year?

This is a sensitive question and should not be asked at the beginning of an interview. If at all possible, questions that are long and difficult to answer and questions that are of a sensitive nature should, in general, be avoided anywhere in the questionnaire. However, people do answer sensitive questions, and it is easier to obtain responses to these types of questions once the interviewer has established rapport with the respondent. Therefore, when our questionnaires include sensitive questions, it might be a good strategy to start with a few warm-up questions to increase respondent co-operation.

General Overview of Measurement and Scaling

A standardized questionnaire can be viewed as a measurement tool. In broad terms, a questionnaire measures attitudes, behaviour, and traits. Measuring each one of these categories involves special challenges. The way a question is asked will affect the results; an ordinary person may not notice subtle differences in wording, so answers may vary widely. Even an apparently simple question like "How many people work in your company?" may produce a wide variety of answers depending on how people interpret it: some may include part-time employees, others may not; some may include employees in all locations and others may include only employees in a particular location; and so on.

There are also many ways to measure attitudes. Some attitudes are held strongly, others less strongly. For this reason, when we measure attitudes in marketing research, we often use scales. Scales can be classified in many ways: numeric versus semantic (verbal), comparative (such as paired comparison and constant sum) versus noncomparative, and so on, depending on perspective. Here we begin with nominal (Yes-No) types of scales, discuss ranking scales, and then move on to rating scales. We also discuss other related scales such as paired comparison, semantic differential, constant sum, and Likert scales. There are literally hundreds of scales and several theoretical frameworks to classify them. The scales that we have chosen to describe are those that are most frequently used in marketing research. As each scale is described, we also point out when that particular scale should be used.



How to Measure Attitudes

Attitudes deal with consumer perceptions and judgments. How well does the consumer like product A? How important is safety to a consumer who is about to buy a car? How does the consumer feel about free trade? Questions like these relate to attitudes. Attitudes exist in the mind of consumers and cannot be directly observed. Attitudes are assumed to be relatively enduring (i.e., attitudes change slowly). This means that a person who has an attitude (i.e., he or she favours free trade or likes a fast car) is likely to hold on to that attitude for a while—perhaps for months, years, or even forever. Attitudes are assumed to influence behaviour. For example, if you like a product very much (attitude), you are likely to buy that product (behaviour).

In marketing research, the term attitude is not used as a pure psychological construct. Rather, it is used as a convenient shortcut to include a wide variety of affective (emotional), cognitive (perceptual), and behavioural aspects of the respondent, which are self-reported. It includes consumer evaluation of products and services, intent to purchase, etc. Since one of our aims is to understand how consumers would behave—what they would buy, how much they would buy, etc.—we measure attitudes to help us predict behaviour.

Writing questions to capture attitudes is subject to two considerations:

1. Existence. Does a consumer hold a given attitude?
2. Intensity. If so, how strongly does the consumer hold that attitude?

Sometimes it is enough to know that consumers hold certain attitudes (e.g., companies are not concerned about the community). But in most cases, we need to know how strongly they hold their attitudes as well. Because attitudes are not directly observable, we have to depend on consumers to tell us what attitudes they hold and how strongly they hold them. As a result, we can never be completely certain about the accuracy of our measurements. However, experience shows that attitudes can be measured in ways that are, in many instances, useful enough for making marketing decisions.

Marketers know that there is no one-to-one correspondence between attitude and behaviour. If 100 consumers state that they would buy Brand X, it is extremely unlikely that all of them will indeed buy X (even assuming that consumers did not intentionally falsify their attitude toward the brand). However, in general, some relationships are found to hold, such as the following:





● A certain proportion of those who say they would buy Brand X will indeed buy it. For example, if 100 consumers say that they would buy Brand X next time, 40 might actually buy it. The proportion may vary by product category. For instance, of car owners who say that they would buy the same model again, only 30 percent might do so while, for users of a certain brand of toothpaste, a similar percentage might be 90 percent.
● The intensity of attitudes is (ordinally) reflected in related behaviours. For instance, those who say that they would "definitely buy Product X" are more likely to buy it than those who only say that they would "buy Product X."

In practice, this means that attitudes can be measured in standard ways, usually using some type of numeric or semantic scale. Scales attempt to quantify the existence and the intensity of attitudes. In marketing research, except in special cases, both the existence and the intensity of an attitude are assumed to be the same as that reported by the individual. Even though we assume this, it is still our responsibility to make sure that attitudes are measured correctly.


RELIABILITY AND VALIDITY REVISITED

To put it another way, the measurement of an attitude should be valid, reliable, and sensitive: as long as the consumer holds that attitude, we should get the same measurement; and when the consumer changes her or his attitudes, it should be reflected in the measurement. Although reliability and validity were dealt with in an earlier chapter, let's review them with the following two examples.

1. Validity. Suppose two similar studies are carried out among salespeople in a certain industry. Assume that the sales estimates in the two studies were very close. This means the sales estimates are reliable—both studies produced a similar estimate of sales—but the estimates may not necessarily be valid. Suppose salespeople in this industry tend to exaggerate their sales figures by about 20 percent; in that case, the sales estimates are not valid because they are off by 20 percent. (A subtle point: if the aim is to measure what salespeople actually say about their sales volume, then the estimate is valid; in this case, what we are measuring is what salespeople say and not what they actually did.)

2. Reliability. Consider a large-scale study using 2500 respondents. A sample size of this magnitude will have an estimated margin of error of ±2 percent (at the 95 percent level of confidence). So if we repeat the study with another sample of 2500 respondents, we will get similar results. On the other hand, if our study had only 100 respondents, our margin of error would be ±10 percent (at the 95 percent level of confidence). If the study is repeated with another sample of 100, the results may not be close. Assuming that all these studies were carried out correctly, all of them are equally valid. However, studies with smaller samples are less reliable.

While reliability and validity are largely independent of each other, we should note that extreme unreliability could result in a lack of validity for all practical purposes, even when the study is technically valid. For example, a survey with a margin of error of ±20 percent is not only unreliable but may well be invalid if the results cannot be meaningfully used in a given context because of the high error margin.

This implies that attitudes should be measured so that their validity (Do they measure what they are supposed to measure?) and reliability (Do they measure what they measure consistently?) can be tested or, at least, defended. The third condition, sensitivity, is a critical concept of measurement but, like reliability, depends on the measurement context. Consider the question, "Were you satisfied with the service that you received when you visited The Bay the last time?" Is it sensitive? It is, if our purpose is to assess the overall reaction. However, it is quite insensitive if our aim is to understand how satisfied a customer is with the service offered. In the above example, a person who says that he or she is not satisfied can be mildly dissatisfied or completely unhappy with the service received. If the aim is to understand the level of satisfaction, then the question as phrased is insensitive.

Therefore, to construct attitudinal questions, we need to ask ourselves these three questions:

1. Validity. Does this question (or battery of questions) measure what we want it to measure?
2. Reliability. Will this question (or battery of questions) measure what we want it to measure consistently?
3. Sensitivity. Is the measurement scale that we use sensitive enough to detect the shades of meaning that we need?
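The margin-of-error figures quoted above follow from the standard formula for a 95 percent confidence interval around a proportion, using the conservative case p = 0.5. A quick sketch (Python; the function name is our own) reproduces the ±2 percent and ±10 percent figures:

```python
import math

def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """95% margin of error for a sample proportion.

    p = 0.5 is the conservative (worst-case) assumption, and
    z = 1.96 is the critical value for 95 percent confidence.
    """
    return z * math.sqrt(p * (1 - p) / n)

for n in (2500, 100):
    print(n, round(100 * margin_of_error(n), 1))
# 2500 -> about ±2.0%; 100 -> about ±9.8% (roughly ±10%)
```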


EXHIBIT 7-2

Attitudinal Scales

Attitude scales fall into three broad groups:

● Dichotomous scales
● Ranking scales (e.g., paired comparison)
● Rating scales:
  – Numeric scales (e.g., 5-pt, 7-pt, 10-pt), unipolar or bipolar
  – Graphic scales
  – Semantic (verbal) scales, including the semantic differential and Likert-type scales
  – Comparative scales*, including paired comparison and constant sum

* These are comparative methods.

ATTITUDINAL SCALING

The topic of attitude scaling is an issue of much debate in the social sciences. For a beginning marketing researcher, it amounts to creating different numeric and verbal scales in order to identify the existence and intensity of consumer opinions that are assumed to be relatively stable. Exhibit 7-2 shows the framework used in the discussion that follows.

DICHOTOMOUS SCALES

Dichotomous scales assess the existence of an attitude. This type of scale usually leads to a "Yes" or "No" response.

Dichotomous scales offer respondents only two options: "Yes" or "No," "Like" or "Don't Like," "Agree" or "Disagree." They are used for "counts" types of data, discussed earlier. Here are a few examples:

1. Do you believe that the death penalty should be brought back or not?
   Yes ____  No ____

2. Do you believe that the drinking age should be raised or not?
   Yes ____  No ____

A variation of the scale is known as the statement endorsement technique. Here a series of statements is presented to the respondent, who can agree or disagree with each one.

3. Which of the following statements would you agree with?
                                            Agree    Disagree
   Canada is a highly taxed nation.         _____    _____
   I am environmentally conscious.          _____    _____
   We should be more like Americans.        _____    _____
   Our education standards are too low.     _____    _____

WHEN TO USE DICHOTOMOUS SCALES

To measure attitudes, dichotomous scales are used in three contexts:

1. When the attitude being measured is clear-cut. For instance, a person is against the death penalty for any reason or not against it.

2. When the attitude being measured has shades of meaning, but the researcher is not interested in them. For instance, the researcher may like to know whether a consumer is against higher taxes, but not how strongly.


3. As a prelude to asking the strength of attitudes. For example, if a consumer says that she or he is against higher taxes, then we may ask whether the consumer is “somewhat strongly” or “very strongly” against higher taxes.

RANKING SCALES

Ranking scales are one step above dichotomous scales. Here the respondents are asked to state their order of preference but not necessarily its intensity. Consider the following question:

a. Canadians are concerned with taxes, social security, and jobs. Of these three, which one is of the greatest concern to you? _______________
b. Which one is next? _______________

Suppose the respondent says that social security is of the greatest concern, followed by taxes. This tells us that social security is more important than taxes for this respondent, but not by how much.

Another example of a ranking scale is paired comparison. Paired comparison questions hide the fact that they are ranking scales. Consider the following question:

Which of the two fast food outlets do you go to more often, Swiss Chalet or Chicken Deli?

Although similar to a dichotomous scale, this is indeed a ranking scale because the consumer is implicitly asked to rank the two fast food outlets in terms of visit frequency.

WHEN TO USE RANKING SCALES

Ranking scales are useful when we are interested in knowing the most important issue(s) relevant to customers. For instance, just before an election, a political party may be interested in knowing the issue that is most important to voters: Is it the economy? Is it leadership? Is it social policy? The fact that an issue is considered "most important" by voters is of utmost importance to the party. If the economy is considered by voters to be the most important issue, then this is the issue that the party may want to concentrate on as an election issue. Similarly, in consumer research, a manufacturer may be interested in knowing how consumers rank factors such as price, quality, and service. Or the manufacturer may simply want to know whether the competition is perceived to be better or worse on different attributes.

A ranking scale is assumed to be easier for consumers to use because it is more in line with how people think. For instance, people tend to think of a brand as the most preferred, not as a 6 on a 7-point scale. However, this assumes that they are not asked to rank too many products. Ranking is not effective when there are too many alternatives, especially when the survey is conducted over the phone.
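Ranked responses are usually summarized by counting how often each item is ranked first (or by average rank). Here is a small illustrative sketch with made-up data:

```python
from collections import Counter

# Hypothetical first- and second-choice data: each tuple is one
# respondent's (greatest concern, next greatest concern).
responses = [
    ("social security", "taxes"),
    ("jobs", "taxes"),
    ("social security", "jobs"),
    ("taxes", "social security"),
]

# Share of respondents naming each issue as their greatest concern.
first_choice = Counter(first for first, _ in responses)
for issue, count in first_choice.most_common():
    print(f"{issue}: {count / len(responses):.0%} ranked it first")

# Note what is *not* here: nothing in these ranks says how much more
# important the first-ranked issue is than the second.
```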

RATING SCALES

In a ranking scale, there is no information on the "distance" between one rank and another. For instance, if Bob says that his first-ranked brand is Tim Hortons donuts and his second-ranked brand is Country Style donuts, we do not know whether Bob would readily accept a Country Style donut if a Tim Hortons donut is not available, or if he would only accept it very reluctantly. In other words, Country Style can be a close second or a distant second. To overcome this limitation, rating scales are used.

Suppose we ask the respondent to rate Tim Hortons and Country Style on a 10-point scale (10 = most preferred; 1 = least preferred). A respondent who rates Tim Hortons as a 10 and Country Style as a 9 considers Country Style to be a close second; a respondent who rates Tim Hortons as a 10 and Country Style as a 3 considers Country Style to be a distant second. In other words, rating scales, unlike ranking scales, provide information on the distance between scale points. Any scale that has three or more points is considered a "rating scale."

An interval scale is one in which the distances between any adjacent points on the scale are (or are assumed to be) the same. Consider the following examples:

Example 1:    $1    $2    $3

Here, the distance between $1 and $2 and the distance between $2 and $3 are the same: $1.

Example 2:    1. B.A.    2. M.A.    3. Ph.D.

Here, can we truly say that the distance between B.A. and M.A. is the same as the distance between M.A. and Ph.D.? Probably not, in terms of prestige and the commitment needed. The distance between a B.A. and an M.A. can be very different from the distance between an M.A. and a Ph.D. Therefore, we cannot really say that these educational ranks are interval scaled. Consider the following example:

On the following scale, please indicate how likely you are to buy a new car in the next 12 months.

Very unlikely   1   2   3   4   5   6   7   8   9   10   Very likely

Can it be said that the difference between a rating of 1 and 3 is the same as the difference between 8 and 10? The answer here is not that clear. Although the difference in both cases is two points, those who gave ratings of 1 and 3 are unlikely to buy; those who gave a rating of 10 are much more likely to buy than those who gave a rating of 8. In other words, the distance between 1 and 3 can be very different from the distance between 8 and 10. However, interval scales allow us to do certain advanced statistical analyses on our data. So, by convention, a scale that has 5 or more points is considered to be "interval scaled." Although this assumption is arbitrary, the statistical methods that we use can accommodate some deviations from what is strictly true.

NUMERIC RATING SCALES

As mentioned earlier, a numeric rating scale is one that measures the intensity of an attitude as a number. Such a scale can have any number of points. Scales with 5, 7, or 10 points are common, although other rating scales—such as scales with 100 points—are also possible. Identical questions can be worded to fit any of these scales. For example:

How would you rate this product on a 10-point scale, where 10 means Excellent and 1 means Poor?
Excellent   10   9   8   7   6   5   4   3   2   1   Poor

How would you rate this product on a 7-point scale, where 7 means Excellent and 1 means Poor?
Excellent   7   6   5   4   3   2   1   Poor

How would you rate this product on a 5-point scale, where 5 means Excellent and 1 means Poor?
Excellent   5   4   3   2   1   Poor

If all of them are designed to capture the intensity of attitudes, which one should be used? Is there an ideal number of points for a scale in marketing research? There is no


definitive answer to this question. It appears that different scales might be suitable in different contexts, and no general guidelines can be given for the beginning researcher. However, certain observations can still be made:

1. Ratings on the same scale tend to be consistent over time. For instance, an insurance company may want to assess how customers evaluate its services on an ongoing basis. If it collects data consistently with the same scale (be it a 5-point, a 7-point, or a 10-point scale), the results will have much greater comparability than if it changes from one type of scale to another.

2. The 5-point scale is often used because it is possible to attach descriptive words such as "Very Good," "Good," "Average," "Poor," and "Very Poor" to the scale points (see the discussion on semantic scales below). This makes it easier to administer the question over the phone: "How do you evaluate the service you received—was it Very Good, Good, Average, Poor, or Very Poor?"

3. The 10-point scale is also frequently favoured because it gives greater freedom to respondents to make finer distinctions. It is also easier to administer, since a 10-point scale is intuitively understood (e.g., a "perfect 10") by many respondents, perhaps because we use the decimal system in our daily lives.

4. We cannot assume that data collected using one scale can easily be mapped onto another scale. For example, if a respondent rates a product as a 4 on a 5-point scale, we cannot translate this as an 8 on a 10-point scale. For this reason, when research is carried out periodically, it is better to use the same scale so that the results can be compared.

Which way should the scales run? When numeric scales are used, you are free to assign scale values in either direction—ascending or descending:

a. How would you rate this product on a 5-point scale, where 5 means Excellent and 1 means Poor?
   Excellent   5   4   3   2   1   Poor

b. How would you rate this product on a 5-point scale, where 1 means Excellent and 5 means Poor?
   Excellent   1   2   3   4   5   Poor

From a technical point of view, it does not make any difference in which direction the numbers go—ascending or descending. But it is good practice to assign higher numbers to more positive responses (as in "a." above) because this is in line with the general conventions of communication. An average person is likely to interpret a rating of 4 on a 5-point scale as "better" than a rating of 2 on the same scale. By following the general convention, we can avoid our results being misunderstood. In any case, it is important to avoid different scales going in different directions in the same questionnaire.
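If two questions do arrive with scales running in opposite directions, analysts normally reverse-code one of them before comparing results. A minimal sketch (our own function name) of reverse-coding a 5-point scale:

```python
def reverse_code(value: int, points: int = 5) -> int:
    """Map a descending scale onto an ascending one (or vice versa).

    On a 5-point scale: 1 <-> 5, 2 <-> 4, 3 stays 3.
    """
    return (points + 1) - value

# A "1 = Excellent ... 5 = Poor" answer of 2 becomes a 4 once the
# scale is flipped so that higher numbers mean more positive answers.
print(reverse_code(2))  # 4
```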



Issues with numeric rating scales. One major problem with numeric scales is that numbers do not have a clear meaning. For instance, if we use a 7-point scale, what does a 5 mean? Good, or just a bit above average? If a person rates a product as a 5 on a 7-point scale, how would that person rate the same product on a 10-point scale? There are no clear-cut answers to these questions. The second problem with numeric scales relates to the way the same scale may be interpreted by different respondents. Since numeric points do not have any intrinsic meaning, a 7 on a 10-point scale may be interpreted as "good" by some respondents and "above average" by others. The third problem has to do with cultural differences. While the meaning of scale points might vary from person to person, there may also be cultural differences in the way rating scale questions are answered. Certain cultural groups tend to give routinely higher ratings. This is often referred to as the "politeness bias." (For instance, many researchers have observed that in Canada, the average ratings for many products and services tend to be higher in Quebec compared to the rest of Canada.²)

In addition to the straightforward numeric rating scales discussed above, there are also other kinds of numeric scales, some of which are described below.

Bipolar (positive/negative) rating scales. When attitudes range from negative to positive, some researchers use negative numbers to indicate negative opinion and positive numbers to indicate positive opinion. For example:

Some people believe that the current level of taxation in Canada is too high. How do you feel about this?

Strongly Disagree   –5   –4   –3   –2   –1   0   1   2   3   4   5   Strongly Agree

Scales like these are likely to be difficult to understand for many respondents because negative numbers are not often used in our daily lives. Many consumers may find it difficult to visualize the difference between –3 and +2. Another problem with bipolar scales that use negative numbers (also known as the Stapel scale) is that they are very difficult to administer over the telephone. For these reasons, unless there are special reasons, it is best not to use bipolar scales with negative numbers as scale points.

Constant sum method. The constant sum method is a special type of numeric scale. Suppose consumers are asked to rate three products on a 10-point scale where 1 means "Do not like it at all" and 10 means "Like it a lot." There is nothing that prevents the respondent from assigning 9 to each of the three products. This may indicate that the respondent likes all three products, but it does not tell us her or his preferences. It does not tell us what the consumer would do if she or he were to choose one when all alternatives are available, as is the case in real life. To obtain a better reading on customer attitudes with regard to the three products, the constant sum method can be used. The constant sum method is a technique in which the respondent is asked to divide a given number of points among all alternatives. Here is an example:

You have $10 to spend on soft drinks. I would like you to divide it among three soft drinks: Coke, Pepsi, and Dr. Pepper. How much would you spend on each soft drink, depending on how much you like each one? You are free to assign the dollars any way you see fit.

Coke         ______
Pepsi        ______
Dr. Pepper   ______
Total        $10.00

The constant sum method can be an effective method of "forcing" respondents to reveal their preferences. Constant sum is likely to be less effective as the number of alternatives increases—for example, it is much more difficult to divide $100 among 17 brands than it is to divide $10 among 3 brands.²

² Although the authors are not familiar with any published sources to support this statement, their overall experience with Canadian data from different studies over the years lends credence to this general belief, held on the basis of observation rather than on experimentation.
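Constant sum answers are easy to validate and to turn into preference shares, because each respondent's allocations must add up to the fixed total. A brief sketch under that assumption (the function and variable names are ours):

```python
TOTAL = 10  # the fixed sum each respondent must allocate

def preference_shares(allocation: dict[str, int]) -> dict[str, float]:
    """Check a constant sum response and convert it to shares."""
    if sum(allocation.values()) != TOTAL:
        raise ValueError("allocations must add up to the constant sum")
    return {brand: amount / TOTAL for brand, amount in allocation.items()}

# One respondent's $10 split across the three soft drinks.
answer = {"Coke": 5, "Pepsi": 3, "Dr. Pepper": 2}
print(preference_shares(answer))
# {'Coke': 0.5, 'Pepsi': 0.3, 'Dr. Pepper': 0.2}
```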


SEMANTIC RATING SCALES

Semantic rating scales use words instead of numbers to measure the strength of an attitude. For example:

How would you rate the quality of service that you received the last time you visited Bank A?
Very good   ______
Good        ______
Average     ______
Poor        ______
Very Poor   ______

This scale uses descriptive words rather than numbers to describe scale points and is called a semantic scale. A semantic scale like this one is often assumed to approximate an interval scale, although it can be argued that the distance between "average" and "good" may not be the same as the distance between "average" and "poor." When we treat a semantic scale as an interval scale (at the coding stage), we assign numeric scale values to the verbal descriptors. For example, we can treat the above semantic scale as an interval scale by assigning numeric scale values as follows:

5 = Very good
4 = Good
3 = Average
2 = Poor
1 = Very Poor
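At the coding and analysis stage, this assignment is a simple lookup followed by ordinary arithmetic. Here is a sketch of how coded semantic responses might be averaged (the mapping is from the text; the response data are invented):

```python
SCALE_VALUES = {
    "Very good": 5,
    "Good": 4,
    "Average": 3,
    "Poor": 2,
    "Very Poor": 1,
}

# Hypothetical responses to the Bank A service-quality question.
responses = ["Very good", "Good", "Very good", "Average", "Very good"]

coded = [SCALE_VALUES[r] for r in responses]
mean_rating = sum(coded) / len(coded)
print(f"Average quality rating: {mean_rating:.1f} on a 5-point scale")
# Only meaningful if the scale points are (roughly) equally spaced.
```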

However, remember that when we do this, the differences between scale points are assumed to be at least approximately the same. We can then make statements like, "Bank A received an average quality rating of 4.8 on a 5-point scale." Where there is a serious violation of this assumption, numeric scale values cannot be assigned to the semantic scale. For instance, suppose our categories were as follows:

Highest level of education
5. Ph.D.           ______
4. Master's        ______
3. Bachelor's      ______
2. High School     ______
1. < High School   ______

We cannot assign numeric scale values to these categories because, for example, the distance between a Master's and a Ph.D. degree cannot be judged—even approximately—to be equal to that between High School and a Bachelor's degree. It does not make sense to say that the average level of education for people in Calgary is 2.9.

Likert-type scale. The Likert scale is used in psychology to measure the strength of attitudes based on a battery of questions. In marketing research, we use the structure of the Likert scale but not necessarily the other methodological trappings that go along with it. Therefore, the Likert scale used in marketing research can be more appropriately called a "Likert-type" scale. Although the Likert scale is a semantic scale, the scale points are converted into numbers for computing averages. The Likert scale technique presents a set of attitude statements. Respondents are asked to express agreement or disagreement with each statement, usually on a 5-point scale. Each degree of agreement is given a numerical value from 1 to 5. For example:


                                                      Strongly           Not              Strongly
                                                      Agree      Agree   Sure   Disagree  Disagree
When you are on holiday, you worry about work.           5         4      3        2         1
Money is no object when you are on holidays.             5         4      3        2         1
Not taking a break from work makes you inefficient.      5         4      3        2         1
Workaholics are always stressed.                         5         4      3        2         1
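Because each degree of agreement carries a numeric value, a respondent's answers to the battery can be averaged into a single score, and each statement can be averaged across respondents. A small illustrative sketch (the data are invented):

```python
# Each row is one respondent's coded answers to the four statements
# above (5 = Strongly Agree ... 1 = Strongly Disagree).
battery = [
    [4, 2, 5, 3],
    [5, 1, 4, 4],
    [3, 2, 4, 2],
]

# Mean agreement per statement across respondents.
n_statements = len(battery[0])
for j in range(n_statements):
    scores = [row[j] for row in battery]
    print(f"Statement {j + 1}: mean = {sum(scores) / len(scores):.2f}")

# Mean score per respondent across the whole battery.
for i, row in enumerate(battery, start=1):
    print(f"Respondent {i}: mean = {sum(row) / len(row):.2f}")
```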

Likert-type scales are extensively used in marketing research, especially in measuring psychographic/lifestyle statements (discussed later in this chapter).

Semantic differential. The semantic differential, like the Likert scale, has some methodological trappings that are not generally used in marketing research. So again, what we use in marketing research is a "semantic differential–type" scale. In the semantic differential, the respondent is presented with a word (e.g., a brand name like Ford Escort) along with a variety of adjectives to describe it. The adjectives are presented at either end of a 7-point scale, ranging from, say, "good" to "bad" or from "fast" to "slow." The respondents are asked to indicate where they would place the brand on the scale for each pair of adjectives (see Exhibit 7-3). To use the semantic differential, we need to show the scale to the respondent because of the visual nature of the scale. Since the bulk of marketing research interviews are currently being done over the phone, the semantic differential is not used extensively in this field. With the advent of online research, the situation is likely to change because the scales can once again be shown visually on the screen.

Issues with semantic scales. Semantic scales have a clear advantage over numeric scales in that each scale point has a specific meaning because a word is assigned to each of them. This is also a disadvantage. When administering the questionnaire over the phone, if we have more than 5 points on a semantic scale, it cannot be communicated easily without confusing the respondent. Compare the following:

a. On a 10-point scale, where 10 means Extremely Efficient and 1 means Extremely Inefficient, how would you rate:
   Air Canada   ____
   Jetsgo       ____
   WestJet      ____

b. How would you rate Air Canada? Would you say it is:
   Extremely reliable     ____
   Very reliable          ____
   Reliable               ____
   Mostly reliable        ____
   Slightly reliable      ____
   Slightly unreliable    ____
   Mostly unreliable      ____
   Unreliable             ____
   Very unreliable        ____
   Extremely unreliable   ____

Question a. is a numeric scale and can be asked over the phone without causing any confusion to the respondent. Question b. is a semantic scale. It cannot be answered unless the interviewer reads all the possible options available over the phone to the respondent.

EXHIBIT 7-3

Semantic Differential

Good     ___ ___ ___ ___ ___ ___ ___   Bad
Fresh    ___ ___ ___ ___ ___ ___ ___   Stale
Strong   ___ ___ ___ ___ ___ ___ ___   Weak
Hard     ___ ___ ___ ___ ___ ___ ___   Soft
Hot      ___ ___ ___ ___ ___ ___ ___   Cold
Active   ___ ___ ___ ___ ___ ___ ___   Passive
Rough    ___ ___ ___ ___ ___ ___ ___   Smooth
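Analysts typically summarize semantic differential data as a "profile": the mean position on each adjective pair, which can then be plotted or compared across brands. A small sketch with invented data (pair labels follow Exhibit 7-3):

```python
# Positions run 1..7, where 1 is the left adjective ("Good") and 7 is
# the right adjective ("Bad"). Each row is one respondent's ratings.
pairs = ["Good-Bad", "Fresh-Stale", "Strong-Weak"]
ratings = [
    [2, 3, 1],
    [1, 2, 2],
    [3, 3, 2],
]

profile = {
    pair: round(sum(row[j] for row in ratings) / len(ratings), 2)
    for j, pair in enumerate(pairs)
}
print(profile)  # {'Good-Bad': 2.0, 'Fresh-Stale': 2.67, 'Strong-Weak': 1.67}
# Comparing two brands' profiles pair by pair shows where their
# perceived images differ.
```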

Even when the interviewer reads all the alternatives over the phone, the chances are that the respondent would not remember all 10 categories. Moreover, the question has to be repeated all over again for Jetsgo and WestJet separately.

The second issue with semantic scales has to do with neutral midpoints, a problem semantic scales sometimes share with numeric scales. Consider this question:

How would you rate the quality of service you received the last time you visited Bank A?
Very good   ____
Good        ____
Average     ____
Poor        ____
Very Poor   ____

This question has a neutral midpoint (average), which is neither positive nor negative. Some researchers believe that the presence of a neutral midpoint encourages some respondents to choose it indiscriminately because it requires the least effort. If we eliminate the midpoint and give the respondents only four choices, they are forced to choose either the positive adjective or the negative adjective.

Very good   ____
Good        ____
Poor        ____
Very Poor   ____

This argument is plausible, and removing the neutral midpoint is likely to force people to reveal their true attitudes, provided they are leaning one way or another (positive or negative). But what happens when respondents have no genuinely positive or negative attitude with regard to the issue at hand? In such cases, it has been found that when people do not have genuine opinions, they tend to choose a positive response rather than a negative one. (See "The Art of Scale Development" in Marketing Research, Vol. 15, No. 3, Fall 2003, pages 10–29; also refer to the BackTalk section in the Winter 2003 issue of the same publication.) In other words, if a respondent is forced to choose between "good" and


RESEARCH IN PRACTICE

What Makes a Scale Good?

A good scale should have the following characteristics:

● minimal response bias
● respondent interpretation and understanding
● discriminating power
● ease of administration
● ease of use by respondents
● credibility and usefulness of results

Minimal Response Bias

Positive response bias: Most people want to be "nice," tending to select the less-critical judgmental position. Consider the 4-satisfaction scale, where many "satisfied" customers subsequently give negative comments in their verbatims, leading service providers to misinterpret their competitive position. Service providers may be at risk of losing these mildly disappointed customers, as they are less likely to take the initiative to voice concern and may quietly seek another supplier. Adding a neutral or "politely negative" category (e.g., the 5-satisfaction and 5n-satisfaction scales) shifts response away from "satisfied" with little effect on the two "dissatisfied" categories. Some researchers argue against including a midpoint or neutral category, forcing a respondent to take sides. However, with judgmental or performance assessments, forcing sides usually results in a positive response.

Endpoint bias: Some respondents avoid end-points in a scale. This phenomenon further exaggerates positive response bias and discourages the use of a 2-point scale. More positive responses result with the 2-satisfaction vs. the 4-satisfaction scale, for example. Although a 2-point scale or a check-off is sometimes effective when simplicity is required, it must be very carefully tested.

Respondent Interpretation and Understanding

Respondents' interpretation of the scales' categories should not be assumed; colloquial use, as well as dictionary meanings, must be taken into account. For example, consider one of the most commonly used scales, 4-excellence. As Stanley Payne noted as early as 1951, "fair" has conflicting colloquial uses. "Fair" weather implies a positive meaning, and a "fair" ruling implies some form of neutrality. Interpretation is further confounded by geographic differences; for example, in some pockets of the South, "fair" is better than "good," whereas it tends to have a mediocre connotation in the Northeast and elsewhere. In addition, some researchers discard verbally anchored scales in favor of numerical or lettered scales. Unfortunately, they face similar problems in interpretation. Which categories are positive? An adequate grade to one person might be a B or a 70% (7), but to another it is a C or 50% (5). Further, some people are not numerically or spatially oriented; for them, translating thoughts and feelings to a number line is a difficult and perhaps unreliable task. Similar arguments can be made against purely verbal scales for those who are more numerically oriented.

Discriminating Power

The ability to discriminate between degrees of customer opinion can help distinguish which service levels are poor vs. adequate vs. exceptional/desirable. Subsequently, this may lead to differentiating a competitive vendor from favoured-partnership status. The latter means the supplier is protected from a competitive threat. If diverse service experiences translate to one response category, information is lost. More scale categories are not necessarily better, but at least three points should be used. For example, a summary of responses to questions with a 10-point response scale may show all categories used, but this does not mean that a refined gradient in performance has been assessed. If each respondent uses no more than five scale categories, this suggests that respondents—when faced with an excessive number of categories—are defining a subset of the scale. Another source of response error is introduced when using verbal scales on the telephone: people can't remember the choices if there are too many of them.

“poor,” he or she would choose “good” rather than “poor.” This means that leaving out a legitimate midpoint can potentially inflate the score. To counter this possibility, some researchers assign more positive points on a scale than negative points. Graphic rating scales use pictures as scale points.

GRAPHIC RATING SCALE

Graphic rating scales use neither numbers nor words to describe scale points. Instead, they use pictures. For instance:


Ease of Administration

The same scale may not work equally well on paper and in a telephone interview. For example, the midpoint of the 5n-satisfaction scale, "neither satisfied nor dissatisfied," is effective on paper, but respondents avoid it in a telephone interview, thus distorting the meaning of surrounding categories. The order in which a scale is presented can help counterbalance positive response bias. As respondents tend to check off the first category that seems relevant on a paper survey, placing negative categories first (to the left) helps cancel out the tendency to respond positively. Since respondents on the telephone may favour the later categories heard, similar reasoning suggests reading a scale from positive to negative. Thus, a scale's order, position, and presentation all can affect its delivery.

Credibility and Usefulness of Results

The primary goal of measurement is to gather useful information. If employees continually challenge the interpretation of a scale—even without justification—then energies are redirected away from quality improvement. This does not mean that employees should dictate scale categories. But researchers may need to reject or replace a scale if the obstacles are too great.

Selected Scales

2-satisfaction: This scale fails response bias and discriminating power criteria. However, combining a strong scale for overall assessments with check-off lists for identifying detailed problem areas has proven to be an effective compromise when trying to control the time commitment required for long surveys that may be needed to assess complex service offerings.

4-satisfaction or 4-excellence: Both fail because of positive response bias and weak discriminating power.

5-satisfaction or 5n-satisfaction: Both address positive response bias, but fare less well than excellence, expectations, and requirements scales in discriminating high-end performance. 5n-satisfaction also fails in telephone delivery.

5-excellence: This type of scale scores well in all criteria; however, it is suboptimal in discriminating power because of a tendency for response distribution to be more skewed to the positive end of the scale when compared to the expectations or requirements scales. This scale is favoured when expectations aren't likely to be well-formed in customers' minds prior to the product/service experience. Robert Westbrook's 7-point "delighted" to "terrible" scale is another alternative for a paper survey, but not for a telephone survey.

Grade: This suffers from inconsistency in interpretation across respondents and company employees.

Number: This also suffers from inconsistency of interpretation. In addition, it has too many categories. A smaller number of categories with phrases to anchor mid- and endpoints may be useful on a paper survey to address both numerical/spatial and verbal orientations. However, this cannot be accomplished on the telephone, and the verbal anchors need to be semantically assessed as with the other verbal scales which we have described.

5-expectations and 4-requirements: These two scales are strong on all criteria. In both extensive research and diverse practical applications, they have led to the most detailed assessments. We assess these scales together because neither is clearly favoured. A comparison of their tradeoffs is useful in demonstrating that there is no optimal or best scale. Both scales counteract response bias well. "Just as expected" and "met" are nearly always clear, positive responses. However, the 4-requirements scale has a slight edge, as the "nearly met" category serves as a "politely negative" response exceptionally well. Both scales work well in telephone and paper delivery. However, the 5-expectations scale may require more repeating in a telephone interview to be comfortably used, but not to the point of being a deterrent. Both scales have been well-received in companies where they have been introduced because they link measures to current definitions and philosophy about total quality management.

Reprinted with permission from Marketing Research, published by the American Marketing Association, Susan J. Devlin, H. K. Don, and Marrbue Brown, Vol. 15, No. 3, Fall 2003.

How would you rate your level of satisfaction with our service the last time you were in our store? (Choose a picture that matches your satisfaction level.)

Similar to the semantic differential, the graphic rating scale is visual and therefore is used infrequently in marketing research.


EXHIBIT 7-4

Happy Face Scale: a row of faces ranging from a frown ("Don't like it at all") to a broad smile ("Like it very much").

WHEN TO USE DIFFERENT RATING SCALES

Numeric rating scales. Although numeric points do not have a consistent interpretation and different consumers tend to interpret them differently, numeric rating scales still have many advantages. They are versatile in that, overall, they give consistent results across time periods. They lend themselves to statistical analysis and modelling. They are easy to communicate and to understand. Because of these advantages, numeric rating scales are widely used in marketing research. Most commonly, they are used to evaluate product and service attributes and to measure strength of opinions.

Semantic rating scales. Semantic rating scales, when they can be used as a substitute for numeric rating scales, offer some advantages. The most important of these is that each scale point—such as "Excellent," "Good," "Poor," etc.—has a definite meaning attached to it. Another benefit is that we do not have to explain the scale to the respondent. However, semantic scales have their own problems. Apart from their not being suitable for telephone interviews when more than a 5-point scale is desired, the midpoint in semantic scales can be problematic. Consider the following examples:

a. How would you rate the quality of this product overall? Would you say it is . . .
   Very Good               ____
   Good                    ____
   Neither good nor poor   ____
   Poor                    ____
   Very Poor               ____

b. How would you rate this product overall? Would you say it is . . .
   Very Good   ____
   Good        ____
   Fair        ____
   Poor        ____
   Very Poor   ____

In a., the midpoint has four words ("Neither good nor poor"). When it is read over the phone along with the other scale points, it can be somewhat confusing to the respondent. You can confirm this by reading the question aloud: "How would you rate the quality of this product overall? Would you say it is Very Good, Good, Neither Good nor Poor, Poor, or Very Poor?" In question b., the word "Fair" is easy to read out, but it does not have a consistent meaning. In some parts of the country, this word could mean acceptable, while in other parts, it could mean mediocre. So while it is generally preferable to use semantic scales wherever they can be conveniently substituted for numeric scales, we need to consider how easily they can be administered and whether the words used in the scale have


consistent meaning to all respondents. Semantic scales are particularly suited for measuring psychographic or lifestyle attributes (discussed later in this chapter).

The constant sum method is not as widely used as either of the above scale types. However, it can be a very useful method to assess overall preferences, provided the number of brands rated does not exceed three or four. This technique is particularly helpful when we want to "force" the respondents to choose. We can, of course, ask them to rank their preferences, but ranking does not tell us how much they prefer one brand over another.

A graphic rating scale is only occasionally used in marketing research.

How to Assess Behaviour

The term behaviour refers to what people do rather than what they think. Sometimes we ask direct questions about behaviour (e.g., "How much did you spend on clothing the last time you went out shopping?") and at other times we ask indirect questions (e.g., "Do you think you spend too much on your clothing?"). The actual question should be guided by the research objectives.

MEASURING GENERAL BEHAVIOUR

Reliable behavioural measurement is central to marketing research: What do consumers buy? When do they buy? Where do they buy? How do they buy? How often do they buy? How much do they buy? Answers to questions like these provide rich information to the marketer.

Let's start with a problem to illustrate the issues involved in measuring behaviour. We want to estimate the number of meals that Canadians eat in restaurants. We may start with two assumptions:

1. The frequency with which a person eats in a restaurant will vary from person to person; and
2. Each person will have an average frequency of eating out that is likely to stay consistent over a period of time.

Consumer A might eat out every other day, Consumer B twice a month, Consumer C once a week, and so on. The assumption that each consumer has a characteristic behaviour frequency (also known as the rate or the base rate) is an important one in marketing. This is what drives market share and is often the target of marketing efforts.

However, how exactly do we measure behaviour such as frequency of eating out? There are at least three ways of asking the question: as a rating question, as a latent behaviour question, or as a manifest behaviour question. For example:

a. As a rating question: How frequently do you eat out?
   Very frequently         ____
   Somewhat frequently     ____
   Somewhat infrequently   ____
   Very infrequently       ____

b. As a latent variable question: In an average month, how often do you eat out?
   Average number of times per month   ____

c. As a manifest variable question: How many times did you eat out last month?
   Number of times last month   ____

| 245

PRACTISING RESEARCHER INTERVIEWS

Questions and Questionnaires
An Interview with Darrell Bricker, President and COO, Public Affairs for North America, Ipsos-Reid

In your view, does writing screening questions require skill? Why so?
Writing screening questions definitely requires skill. This is because the researcher needs to, with an economy of words, ask the precise question(s) necessary to qualify the right respondents for a study. Why is this important? Well, for two reasons. The first is because time is money. Screening questions must be as short as possible so no expensive interviewer time is wasted on people who don't qualify for the study. And, secondly, the questions must effectively screen out false positives and false negatives. So, the screen must let in those who you want to talk to, and keep out those who you don't want to talk to.

In marketing research, how crucial is measuring attitudes and behaviour correctly?
It is the entire purpose of what we're doing. I often describe opinion and consumer research as being like a wind tunnel or a simulation for marketers. A survey that properly measures public attitudes and potential behaviours helps marketers test fly their products before they launch them into the market place. And, an effective wind tunnel points out the design flaws in a product so they can be modified prior to a final launch. If the attitudes and behaviours that constitute the simulation are measured incorrectly, then the simulation is worse than useless.

Can you give a couple of examples of how things can go wrong when an incorrect attitude or behaviour question is asked?
The biggest problem I see in attitude and behaviour measurement is faulty questionnaire design. Why? Because researchers and their clients can get much too close to the subject they're examining. A good example of this is the question public sector clients always want to ask: should "x" be a federal, provincial, or a municipal responsibility? Trust me, most people don't know and really don't care, as long as someone deals with the issue in question. The best way to combat this problem is by following what I call the "Mother-in-Law in Cambridge rule." If your mother-in-law in Cambridge doesn't understand or care about the question you're asking, then you probably shouldn't be asking it. The other big problem I see is researchers who are enthralled by their analytical techniques. We too often forget that the first rule of social science is parsimony: the simplest solution is always the best solution. If a yes/no question works best (and many times it does) ask it. Sure, I appreciate the power of multivariate analysis, and know it needs scales (the more variance the better), but too often I find people using complicated questions simply because their analytical techniques require them. Sometimes when all you have is a hammer every problem starts to look like a nail.

In your view, do attitudes really predict behaviour?
Sure they do. The best example I can give of this is political research that does an excellent job of accurately reflecting how people will vote.

How much can we rely on reported behaviour of consumers?
If you've done a good job of designing the research, satisfying the conditions of reliability and validity, you will get the right answer. Why? BECAUSE SOCIAL SCIENCE WORKS! I've seen enough good and bad research over the last twenty years to know that coming up with the right answer is rarely a fluke. It is the product of a diligent researcher practising their craft.

If we cannot rely on consumers' reported behaviour, what is the use of collecting it?
According to George Gallup Jr., the single best test of the accuracy of a survey methodology is its ability to accurately predict election outcomes. This is because elections are one of the only broad-based occurrences we have in which a survey result relates to a real-world event in public behaviour. All one has to do is review the record of the major polling companies in Canada over the last several years to see how unquestionably accurate they've been in predicting election outcomes. They must be doing something right.

Does it make any difference what kind of scales we use to collect data? What are your thoughts on scales?
This is one of the most overwrought discussions in the survey research business. But, after many years of dealing with this issue, and trying many different scales, I've concluded that the simpler the scale the better. In fact, for my money, the best scale for most research problems is a five-point (i.e., do you strongly agree, somewhat agree, neither agree/disagree, somewhat disagree, or strongly disagree with "x"). Why? It has a natural midpoint, it is easy to administer because you can read it as words rather than numbers, and it allows for enough variance to facilitate multivariate analysis. Also, when you're reporting results to a client, they generally want to know who is with them and who is against them; mean scores or "top boxes" don't communicate with the same power.

Question a., "How frequently do you eat out?" is the weakest of the three if we are trying to estimate how often a person actually does eat out. There are two problems with this question. First, what is meant by frequently? Does it mean every day, every other day, once a week, or once a month? Second, does frequently mean the same thing to every respondent? Consider two respondents, Jane and Bill. Both eat out once a week. For Jane very frequently means more often than once a week, while for Bill very frequently means every week. Although one person eats out as frequently as the other, Jane may choose somewhat frequently while Bill may choose very frequently. Does that mean we should avoid asking such questions? Not necessarily. Questions of this nature are useful in understanding how consumers view themselves and their habits. In fact, Question a. is an attitudinal question. It tells us what consumers think of their behaviour rather than giving an exact reading of the behaviour itself.

Question b. assumes that there is a "characteristic rate" associated with behaviour. This is a reasonable assumption: some may eat out once a month, some twice a month, and so on. In mathematical modelling, this characteristic rate of behaviour is called the latent variable.3

Latent variables are difficult to measure directly. Consider the example in the following question: In an average month, how often do you eat out? The question may be interpreted differently by different people:

● Some may think back to the recent past (the previous month or so) and on that basis may estimate their average rate;
● Some may think back to the previous 12 months or so and on that basis may estimate their average rate;
● Some may think back to the previous 12 months or so, but exclude December (i.e., not an "average" month because of the holiday season) and on that basis may estimate their average rate;
● Some may look forward to what they intend to do in the future and answer accordingly;
● Some respondents may select a month that they think is "average" and report what they did in that month;
● Many respondents will mentally round the numbers. For example, a respondent who eats out 20 times a year is more likely to say twice a month rather than 1.67 times per month.

So, Question b. is also an attitudinal question with different people interpreting the same question differently, although b. gives answers that are easier to quantify than a. Question c. is a true behavioural question. Here, all respondents are asked to give an answer to this question:

How many times did you eat out last month?

3 The term latent variable is also used in attitudinal measurement to describe constructs (e.g., customer satisfaction or customer loyalty) that cannot be directly measured. They are generally derived with the use of advanced statistical techniques such as factor analysis and structural equation modelling.

RESEARCH IN PRACTICE

The Biggest Error in Marketing Research Is Virtually Never Mentioned

The biggest error in marketing research is virtually never mentioned. Never in the proposal for the study, never in the report, and never in any discussion of the findings. And what is this error? It's asking the wrong questions (which includes incorrectly wording the right questions), and therefore not finding out what you really wanted to know, but more importantly not even knowing you did not find it out. And going further, it's proceeding to draw the wrong conclusions because you asked the questions the wrong way, and sometimes asked them of the wrong people, but we will get to that later. The proper wording of the question, if not the most important part of the research project, is at least as important as every other part.

Perhaps the most famous example of asking the question in the wrong way was a study a few years ago, which set out to provide the definitive answer on whether or not people believe the Holocaust really occurred. Findings from this study received a high degree of publicity, because the study seemingly showed that 33% of the American public did not believe the Holocaust actually happened. Is this possible: that one in three think it never happened? No, that's not possible. What really happened was that the question was phrased in such a misleading and confusing way that many people who felt one way answered in the opposite way. See if you can see which question might give the wrong answer:

Version A. Does it seem possible to you that the Nazi extermination of the Jews never happened? Answer: 33% said it was possible it never happened.

Version B. Do you doubt that the Holocaust actually happened, or not? Answer: 9% said they doubted it.

As should be clear, Version A used a double negative. Most of those who said "yes it is possible" actually meant "yes, it did happen," but in this case the "yes it's possible" answer really meant "no it did not happen." To say that the Holocaust happened required someone to say, "it's impossible it never happened." But there is still a third way to ask the question, which makes it even clearer and shows even lower disbelief: "In your opinion, did the Holocaust definitely happen, probably happen, probably not happen, or definitely not happen?" Answer: 4% said it probably or definitely did not happen.

One issue under investigation was asked three different ways, providing three different answers. Take your pick.

All good researchers know that how you ask the question frames the answer. After all, that's how much litigation research is conducted. If you want to prove a certain answer, you ask the question one way. But if you are on the other side, you ask it another way, because you know how the answer will turn out. (And then you just hope you are a better researcher for one side than your researcher friend is for the other side.) That's one reason litigation research is so difficult: you have to design a study for one side that, if you were on the other side, you couldn't critique.

It is also important to avoid hinting at the possible answer to the question by how the question is phrased and to avoid leading the respondent to give an answer you want to hear. Sometimes this is done by posing an issue to the respondent, then asking about it. Here is a real example, which I have disguised slightly:

"If you knew that the biggest threat to the environment is global warming, how much more, per month, in state or local taxes would you be willing to pay to solve this problem?"

Results from this study showed (surprise!) that people wanted to pay more taxes to clean up the environment, and that was nice because an environmental group paid for the study. But there are at least five things wrong with this question. First, it makes a statement that cannot be proven, or at least one that many people would disagree with: that global warming is the biggest threat to the environment and presumably must be solved in some fashion. Second, it attempts to "educate" the respondent by stating a presumed problem and then following that with only one solution to the problem. Third, it assumes that people will want to pay more taxes (the question did not ask if they wanted to pay more, it simply said how much more), so it is easier to give an answer rather than say zero. Fourth, the answer is taken at face value. Just because someone says they will pay $11–15 per month in additional taxes does not mean they actually would do so. Fifth, it assumes that paying more taxes is the way to solve the problem. Just because one might pay $15 more per month does not mean the problem will be solved, and, in fact, there is no promise in the question that it would be solved anyway.

What if the question had been framed differently? Suppose it went something like this:

"Some people say they would be willing to pay as much as $20 per month more in taxes to solve global warming, and others say that there is already enough waste in government spending to solve the problem without a tax increase. Which would you prefer: an additional $20 per month in new taxes, or spending $20 of your taxes more wisely?"

Does anyone doubt what the answer would be to that question? The point is, you can get any answer you want by how you ask the question. But, of course, as the published survey report on global warming went on to say, the accuracy of the sample was plus/minus 4.5%.

Source: Nelems, Jim (2002) The Secret Rules of Successful Marketing. Longstreet Press: Atlanta, GA, pp. 125–131.

The time frame is precisely defined in the question, and the respondent is not asked to estimate a rate but to report what he or she actually did. For this reason, the variable being measured is called the manifest variable. In measuring behaviour, questions of type c. should be preferred.4 In fact, in newspaper and magazine readership studies and in panel studies, manifest variables are used almost exclusively.

While this is a better way to ask a behavioural question, we should be aware of the problems associated with this type of question as well. Suppose we want to estimate the frequency of drinking. If we do our survey in January and ask the respondent, "How many drinks did you have last month?", we are likely to overestimate drinking because many people tend to drink more during the holiday season than at other times of the year. Similarly, if we ask in February, "How many times did you eat out last month?", we may underestimate the frequency of eating out since people may go out less immediately following the holiday season. Of the three, question c. is the best behavioural question. However, we should make sure that there are no special factors affecting the outcome.

Thus, behaviour can be measured as an attitudinal variable (least precise), as a latent variable, or as a manifest variable (most precise). Each type of measurement has its own uses. In writing a questionnaire, we should first determine how the information will be used and ask the question accordingly. To give an example, if we want to know how Canadians perceive themselves in relation to alcohol, we can ask:

How frequently do you have alcoholic drinks?

   Very frequently ____   Somewhat frequently ____   Somewhat infrequently ____   Very infrequently ____

If our aim is to know how much people actually drink, a question like the following would provide more useful answers:

How many drinks did you have last month, either at home or elsewhere?

4 If we want to determine the underlying rate of consumption, we can do so by applying mathematical models to the manifest behaviour question.
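Footnote 4 notes that mathematical models can recover an underlying rate of consumption from manifest answers. As one simple, hypothetical illustration (the seasonal index values, the function name, and the data below are all assumptions, not a model prescribed in this chapter), last-month counts can be deflated by a seasonal index before averaging:

# A minimal sketch: estimating an underlying monthly rate from manifest
# ("last month") answers, assuming hypothetical seasonal index values.
# An index above 1 means the month runs above a typical month (e.g.,
# December drinking); dividing by it deflates counts toward a typical month.

SEASONAL_INDEX = {"December": 1.30, "January": 0.90, "June": 1.00}  # assumed

def estimated_base_rate(last_month_counts, month):
    """Average the reported counts, deflated by the month's seasonal index."""
    index = SEASONAL_INDEX[month]
    adjusted = [count / index for count in last_month_counts]
    return sum(adjusted) / len(adjusted)

# A survey fielded in January asking "How many drinks did you have last
# month?" collects reports that describe December behaviour:
reports = [12, 8, 20, 4, 10]
print(f"Estimated typical monthly rate: {estimated_base_rate(reports, 'December'):.1f}")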

MEASURING SENSITIVE QUESTIONS ABOUT BEHAVIOUR

Sometimes we may need to ask questions that are generally considered "personal," such as total net worth, reasons for divorce, or frequency of having sex. It is possible to elicit sensitive personal information in certain contexts.

The first context is when the interviewer has established rapport with the respondent. A certain amount of rapport is generally created by the simple act of talking. Because of this, it is always preferable not to ask sensitive questions at the beginning of an interview.

The second context is perceived relevance. An abrupt question like, "Do you have or does anyone in your family have cancer?" is likely to encounter greater resistance than the same question placed in a relevant context: "We are doing a survey among Canadians to assess the incidence of cancer on behalf of the Cancer Society. In this connection, may I ask if you or anyone in your family has cancer?"

A third context is obtained when we make the sensitive question a part of a general inquiry, as follows:

We are doing a survey among Canadians to assess the incidence of different ailments. In this connection, may I ask if you or anyone in your family has

   Sleep problems    ____ Yes   ____ No
   Cold/cough        ____ Yes   ____ No
   Blood pressure    ____ Yes   ____ No
   Cancer            ____ Yes   ____ No
   Back pain         ____ Yes   ____ No

Yet another way of handling the problem is to broaden the scope of the question:

We are doing a survey among Canadians to assess the incidence of cancer. Do you personally know of anyone who suffers from cancer?

In summary, behaviour measurement in marketing research is best described as measurement of self-reported behaviour. We assume that people tend to provide a fairly accurate description of their behaviour as long as the questions are easy to answer and do not involve sensitive issues. However, we should remember that there may be differences between actual behaviour and self-reported behaviour. Modern measurement techniques (e.g., scanner data at retail outlets or computer tracking of websites visited) enable us to measure consumer behaviour directly. Perhaps, in the future, most behaviour will be measured by such means. For now, the bulk of what we call behaviour is self-reported behaviour. Our aim, in designing a questionnaire to measure behaviour, is to make sure that self-reported behaviour is as close to actual behaviour as possible.

Lifestyle Measurement

A commonly accepted model is that attitude influences behaviour. For example, if consumers like a product (attitude), they buy it (behaviour). If they do not like it (attitude), they do not buy it (behaviour). It is for this reason that attitude measurement is given a central role in marketing research.

However, an attitudinal statement by itself is not very insightful. For example, a statement such as, "I prefer to rent a movie than watch it in a theatre" tells us very little about the person and provides only limited information to the marketer. A person may prefer to rent because she or he (a) does not want to spend the extra money to go to a theatre; (b) prefers to watch movies at home; or (c) prefers to watch with breaks in between. We can of course ask respondents why they rent movies, but doing so is less likely to elicit answers such as, "I cannot really afford to go to the movies in theatres," even when that is the real reason, because such answers may be perceived by respondents as showing themselves in a poor light. Another reason why direct questions may be less helpful is that a direct question has limited explanatory power. For instance, "prefers to watch it at home" does not really tell us whether it is because the respondent is a homebody or because the theatres are not convenient to get to in that person's neighbourhood.

In many instances, we may need to go no further than simply asking an attitudinal question, followed perhaps by a question about the reason for holding that attitude. However, there are instances where we need to understand why a consumer does what he or she does. This is commonly done to group consumers who have similar underlying patterns of attitudes, especially with relation to a product category. The process of grouping consumers based on how similar they are on a number of attributes is known as market segmentation. One can segment the market based on any set of relevant characteristics, such as demographic similarities (e.g., older, wealthier, better-educated consumers), similarities of benefit sought from a product (e.g., those who buy cars purely to get from point A to point B), or lifestyle similarities.

Lifestyle or "psychographics" refers to an assorted combination of a person's self-reported behaviour (e.g., I go to movies often), self-reported personality traits (e.g., I'm easily excited), and attitudes (e.g., I always plan for my future). Usually the respondent is asked to what extent she or he agrees with a number of statements. Although there are some theoretical models for measuring lifestyle, lifestyle questions are constructed by the researcher, who takes into account the product in question and the relevant motivational and personality traits that may be related to it. Here is an example of a few lifestyle questions that deal with organic foods:

Please indicate whether you agree strongly, agree somewhat, disagree somewhat, or disagree strongly with each of the following statements regarding organic food products. Please note that there are no right or wrong answers. We would just like to know how you feel about these statements.


(Response options for each statement: Agree Strongly, Agree Somewhat, Disagree Somewhat, Disagree Strongly)

1. I am willing to pay a little more for organic food.
2. Organic produce does not look as appealing as nonorganic produce.
3. If there were more organic food choices available, I would buy more organic products.
4. I find that organic produce is often not as fresh or long-lasting as nonorganic produce.
5. Organic produce tastes better than regular produce.
6. There are certain items for which I always buy organic.
7. I do not understand why organic food usually costs more than nonorganic products.
8. In the future, I expect to purchase organic products more frequently than I currently do.
When we analyze a battery of lifestyle questions, we are not interested in knowing how respondents answered any particular question. Rather, we are interested in the pattern of their responses. A statistical technique known as cluster analysis is designed to identify people who answer many questions similarly (a brief sketch of such a clustering appears at the end of this discussion). For instance, consumers who disagree with statement 1 but agree with statement 7 may be identified by cluster analysis as belonging to the same segment. After examining the statements, we can conclude that consumers in this segment are price sensitive. (Most lifestyle segmentation studies will contain a large number of statements, typically between 30 and 100.)

Not all questionnaires will contain lifestyle statements. In general, only questionnaires that aim to segment the market on a lifestyle basis will contain these types of questions. If you are faced with writing a lifestyle battery of questions, consider the following:

1. Consider the product and the relevant benefits of the product. For example, if the product is a car, the relevant benefits are transportation, impressing people, freedom to move, the excitement of moving fast, etc.
2. Consider what might potentially influence the purchase of the product. For example, being practical, being concerned with other people's opinions, being concerned with freedom, being concerned with excitement, etc.
3. Create statements to represent the list developed in step 2.
4. Consider doing a few focus groups.

In a lifestyle battery of questions, even when we appear to measure behaviour, we are only measuring what the individual feels about the behaviour. For instance, consider the statement below. It does not really intend to measure whether the respondent would actually pay more or, if so, how much more. Instead, it tries to measure whether the respondent considers himself or herself as someone who would pay more.

1. I am willing to pay a little more for organic food.

   Agree Strongly ____   Agree Somewhat ____   Disagree Somewhat ____   Disagree Strongly ____
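As noted above, cluster analysis groups respondents by their overall pattern of answers. The sketch below is a hypothetical illustration using k-means clustering; the response data, the two-cluster choice, and the 1–4 coding of the agreement scale (1 = Disagree Strongly to 4 = Agree Strongly) are all assumptions for illustration, not a procedure prescribed by this chapter.

# A minimal sketch of clustering lifestyle-battery responses (hypothetical data).
# Rows are respondents; columns are the eight organic-food statements,
# coded 1 (Disagree Strongly) to 4 (Agree Strongly).
import numpy as np
from sklearn.cluster import KMeans

responses = np.array([
    [4, 2, 4, 2, 4, 4, 1, 4],   # enthusiast-looking pattern
    [4, 1, 4, 2, 3, 4, 2, 4],
    [1, 4, 2, 4, 2, 1, 4, 1],   # price-sensitive/sceptical pattern
    [2, 4, 1, 4, 1, 1, 4, 2],
    [4, 2, 3, 1, 4, 3, 1, 3],
    [1, 3, 2, 4, 2, 2, 4, 1],
])

# Identify two segments based on the overall pattern of answers.
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(responses)
print("Segment assignments:", model.labels_)
print("Segment profiles (mean agreement per statement):")
print(np.round(model.cluster_centers_, 1))

Reading the segment profiles against the statements then lets the researcher label the segments, for example as price-sensitive consumers versus committed organic buyers.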









Measuring Demographic Traits

As we mentioned in the last chapter, the demographic section is usually placed at the end of the questionnaire. Demographic questions usually include age, gender, income, household composition, education, occupation, etc. Because these are personal questions, asking them upfront is likely to elicit suspicion. The only exception is when we use some demographics for screening purposes. For instance, if we are interested in interviewing only women who are between 18 and 45, we may make our purpose explicit and ask the following screening question at the beginning of the interview:

Today, we are interviewing women who fall into certain age groups. Could you please tell us to which of the following groups you belong?

   Under 18 yrs ____   (Thank and terminate.)
   18–45 yrs    ____   (Continue.)
   45+          ____   (Thank and terminate.)
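In computer-assisted interviewing, the terminate/continue instructions above become routing logic. The sketch below is a hypothetical illustration of such routing; the function name and category codes are made up for this example and are not drawn from any particular interviewing system.

# A minimal sketch of screener routing logic (hypothetical codes).
# Returns True if the interview should continue, False if we thank
# the respondent and terminate, mirroring the skip instructions above.

def passes_age_screen(category: str) -> bool:
    """Only the 18-45 group qualifies for this (hypothetical) study."""
    routing = {
        "under_18": False,   # Thank and terminate.
        "18_45": True,       # Continue.
        "over_45": False,    # Thank and terminate.
    }
    return routing[category]

if passes_age_screen("18_45"):
    print("Continue with the main questionnaire.")
else:
    print("Thank the respondent and terminate.")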

The demographic section of a questionnaire might look like this:

BASIC DEMOGRAPHICS

Finally, we need some basic information about you in order to classify and interpret the data. This information is for tabulation purposes only and remains strictly confidential. Please answer all the questions.

1. a. With respect to your current marital status, are you... (Check one box only.)

   ■ Married
   ■ Cohabiting with partner/living common-law
   ■ Single
   ■ Separated/Divorced/Widowed

   b. How many people in total, including yourself, are living in your household?

   ■ One
   ■ Two
   ■ Three
   ■ Four or more

2. How many of these people are children under 18 and how many are adults?

   Adults ____   Children ____

3. In which of the following educational categories do you belong?

   ■ Some public school
   ■ Completed public school
   ■ Some high school
   ■ Completed high school
   ■ Some/completed technical school
   ■ Some/completed community college (CEGEP)
   ■ Some university
   ■ Completed university
   ■ Graduate school

4. How many people in your household are employed full-time?

   ■ None
   ■ One
   ■ Two
   ■ Three or more

5. What is your occupation? _____________________________________

6. What was the language that you first spoke in childhood and still understand? (Record in Question 7.)

7. What is the language spoken in your home?

                Question 6                 Question 7
                Language First Spoken      Language Spoken in the Home
   French           ■                          ■
   English          ■                          ■
   German           ■                          ■
   Ukrainian        ■                          ■
   Italian          ■                          ■
   Chinese          ■                          ■
   Polish           ■                          ■
   Other            ■                          ■

8. In which of the following categories does your total annual household income, before taxes, fall?

   ■ Under $10 000
   ■ $10 000 – $19 999
   ■ $20 000 – $39 999
   ■ $40 000 – $59 999
   ■ $60 000 – $79 999
   ■ $80 000 – $99 999
   ■ $100 000 or more

9. Finally, which of the following age groups do you belong to?

   ■ 18–24 yrs
   ■ 25–34
   ■ 35–44
   ■ 45–54
   ■ 55–64
   ■ 65+

We let the respondent know that this is the last question to make sure that he or she doesn't break off here thinking there are more questions to come. Every bit helps. Some questionnaires contain more demographic information than others. Some principles of writing demographic questions are as follows:

● There is a general reluctance among many people to reveal their exact age or income. So where age and income are concerned, use categories as shown above. Even when you use categories, use as few as needed for your purposes. For instance, people are more likely to respond to an age question with fewer categories (under 35, 35–49, 50 and above) than to one with many categories (under 20, 20–29, 30–39, 40–49, and so on). The same is true of other sensitive questions, such as those about income.

● People do not want to belong to the lowest category in variables like income. So it is sometimes good to include a low-income category to which most people are unlikely to belong. For instance:

10. In which of the following categories does your total annual household income, before taxes, fall?

   ■ Under $10 000
   ■ $10 000 – $19 999
   ■ $20 000 – $39 999
   ■ $40 000 – $59 999
   ■ $60 000 – $79 999
   ■ $80 000 – $99 999
   ■ $100 000 or more

Very few Canadian households have an income of less than $10 000. The inclusion of this category is mainly to make the other lower-income categories more palatable.

● Unless we specifically need to know personal income, it is better to ask for a person's household income because, in many cases, household income is larger than personal income and so is less embarrassing to reveal.
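One practical consequence of collecting income in categories is that later analysis often needs a numeric value. A common workaround is to recode each bracket to its midpoint; the sketch below is a hypothetical illustration of that recoding (the midpoint choices, including the value assumed for the open-ended top bracket, are assumptions for illustration rather than rules from this chapter):

# A minimal sketch: recoding bracketed income answers to numeric midpoints
# (hypothetical midpoints; the open-ended top bracket gets an assumed value).

INCOME_MIDPOINTS = {
    "Under $10 000": 5_000,
    "$10 000 - $19 999": 15_000,
    "$20 000 - $39 999": 30_000,
    "$40 000 - $59 999": 50_000,
    "$60 000 - $79 999": 70_000,
    "$80 000 - $99 999": 90_000,
    "$100 000 or more": 125_000,  # assumption for the open-ended bracket
}

answers = ["$20 000 - $39 999", "$60 000 - $79 999", "$100 000 or more"]
recoded = [INCOME_MIDPOINTS[a] for a in answers]
print("Approximate mean household income:", sum(recoded) / len(recoded))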

Although we ask for demographic information, we should remember that many respondents might refuse to provide some or all of it. Refusals can be particularly high for income-related questions.

In the last chapter, we talked about writing questions that are clear and unambiguous. In this chapter, we have discussed how to design questions to elicit different types of information. Writing a questionnaire is both a science and an art. This means that although these two chapters will give you a reasonable grounding in the science of questionnaire design, as you gain experience you will also note that the rules are not hard and fast. Once you understand the basic principles, the actual question, as well as its wording, depends on the context in which it is asked, the respondent from whom the information is needed, and the precision of the answer required.


SUMMARY

1. Screening questions identify whether the person we have contacted is eligible to answer the questions at all. They are usually asked at the beginning of the interview. On rare occasions, where asking the screening questions is likely to bias responses, screeners are asked at the end.

2. Warm-up questions are designed to establish rapport between the interviewer and the respondent. They should be short, easy to answer, and not of a personal nature.

3. Attitudes deal with consumer perceptions and judgments. Attitudes exist in the minds of consumers and cannot be directly observed. Attitudes are assumed to be relatively enduring (i.e., attitudes change slowly).

4. Writing questions to capture attitudes is subject to two considerations:
   a. Existence. Does a consumer hold a given attitude?
   b. Intensity. If so, how strongly does the consumer hold that attitude?

5. There is no one-to-one correspondence between attitude and behaviour. However, in general, some relationships are found to hold:
   a. A certain proportion of those who say they would buy a product will indeed buy it.
   b. This proportion may vary by product category.
   c. The intensity of attitudes is (ordinally) reflected in related behaviours.

6. Attitude measurement should be valid, reliable, and sensitive:
   a. Validity. Does this question (or battery of questions) measure what we want it to measure?
   b. Reliability. Will this question (or battery of questions) measure what we want it to measure consistently?
   c. Sensitivity. Is the measurement scale we used sensitive enough to detect the shade of meaning needed?

7. Dichotomous scales are used to assess the existence of an attitude. They usually lead to responses such as "Yes" or "No". They are used
   a. when the attitude being measured is clear-cut;
   b. when the attitude being measured has shades of meaning but the researcher is not interested in them; and
   c. as a prelude to asking the strength of attitudes.

8. In ranking scales, respondents are asked to state their order of preference but not necessarily the intensity. Ranking scales are useful when we are interested in knowing the most important issue(s) relevant to customers.

9. Rating scales, unlike ranking scales, provide information on the distance between scale points. Any scale that has three or more points is considered a rating scale. An interval scale is one in which the distance between any two points on the scale is (or is assumed to be) the same. A numeric rating scale is one that measures the intensity of an attitude as a number.
   a. Ratings on the same scale tend to be consistent over time.
   b. The 5-point scale is often used because it is possible to attach descriptive words to the scale points.
   c. The 10-point scale is also frequently favoured because it gives respondents greater freedom to make finer distinctions.
   d. We cannot assume that data collected using one scale can easily be mapped onto another scale.

10. From a technical point of view, it does not make any difference which direction the numbers go, ascending or descending. But it is good practice to assign higher numbers to more positive responses because this is in line with the general conventions of communication.

11. One major problem with numeric scales is that numbers do not have a clear meaning.

12. Bipolar (positive/negative) rating scales assign negative numbers to indicate negative opinion and positive numbers to indicate positive opinion.

13. The constant sum method is a technique in which the respondent is asked to divide a given number of points among all alternatives.

14. Semantic scales use words instead of numbers to measure the strength of an attitude.

15. The Likert technique presents a set of attitude statements. Respondents are asked to express agreement or disagreement with each statement, usually on a 5-point scale.

16. In the semantic differential, the respondent is presented with a word, such as the brand name of a car, and then with a variety of adjectives to describe it. The adjectives are presented at either end of a 7-point scale, ranging from, say, "good" to "bad" or from "fast" to "slow." Respondents are asked to indicate where they would place the brand on the scale for each pair of adjectives.

17. Graphic rating scales use neither numbers nor words to describe scale points. They use pictures to indicate the strength of an attitude.

18. Numeric rating scales are versatile and lend themselves to statistical analysis and modelling. They are easy to communicate and to understand. Because of these advantages, numeric rating scales are widely used in marketing research.

19. Semantic rating scales have meaningful scale points, but they are mostly unsuitable for telephone interviews. When more than a 5-point scale is desired, the semantic scale's midpoint can be problematic.

20. The constant sum method is not as widely used as the last two types of scales. A graphic rating scale is only occasionally used in marketing research.

21. Behaviour measurement is central to marketing research. Behaviour can be measured as an attitudinal variable (least precise), as a latent variable, or as a manifest variable (most precise). Each type of measurement has its own uses.

22. It is possible to elicit sensitive personal information by creating rapport with the respondent and by providing a context for the question. Behaviour measurement in marketing research is best described as measurement of self-reported behaviour.

23. Lifestyle or "psychographics" refers to an assorted combination of a person's self-reported behaviour (e.g., I go to movies often), self-reported personality traits (e.g., I'm easily excited), and attitudes (e.g., I always plan for my future). When we analyze a battery of lifestyle questions, we are not interested in how any particular question is answered but in the pattern of the responses. Not all questionnaires will contain lifestyle statements.

24. The demographic section is usually placed at the end of the questionnaire. Demographic questions usually include age, gender, income, household composition, education, occupation, etc.

Key Terms

attitudes, p. 232
bipolar (positive/negative) rating scales, p. 238
constant sum method, p. 238
dichotomous scales, p. 234
graphic rating scales, p. 242
lifestyle, p. 251
Likert scale technique, p. 239
psychographics, p. 251
ranking scales, p. 235
rating scales, p. 236
screening question, p. 253
semantic differential, p. 240
semantic rating scales, p. 239
warm-up questions, p. 231

Review Questions

1. Attitudes can be of several different types. Describe the three commonly used types of attitudes. Then give one example of each of these types of attitudes and provide one well-written question designed to elicit this attitude from a respondent.

2. Should an attitude scale contain an even or an odd number of points? Research this topic, provide your opinion, and justify your stance.

3. What is the purpose of screening questions? Where are they typically placed in the questionnaire?

4. Provide a basic definition of reliability along with an example.

5. Provide a basic definition of validity along with an example. Now, contrast reliability and validity to show the differences.

6. When should ranking scales be used and when should they not be used?

7. When constructing a 5-point rating scale for satisfaction, which number should be assigned to "very satisfied"?

8. Give two examples of manifest variables and two examples of latent variables.

9. Provide examples of three lifestyle variables and develop questions to measure those variables.

10. Develop a short questionnaire to measure students' attitudes toward your educational institution.

11. Identify the type of scale used in the following question, which was part of an internet survey.

12. Now, rewrite the question in Question 11 in a totally different way with the intention of obtaining a ranking among the four golf courses with respect to enjoyment and also obtaining a measure of the relative intensity of respondents' enjoyment among the golf courses.

Scale for Question 11

Please rate your enjoyment of the golf courses available for after-work rounds. Check the box to the left of each row to indicate those that you used in 2003. For each box checked, circle the number on the scale.

Response columns: Did not use it in 2003 / Did not enjoy / Slightly enjoyed / Somewhat enjoyed / Fairly enjoyed / Enjoyed very much

   ■ Lionhead Monday
   ■ Angus Glen Tuesday
   ■ Hunters' Glen Wednesday
   ■ Richmond Hill Thursday

MINI CASE

Commissioner of Competition v. Sears Canada Inc. (CT-2002/004)

On July 22, 2002, Mr. Gaston Jorré, the Acting Commissioner of Competition, Competition Bureau, Industry Canada, filed a Notice of Application with the Competition Tribunal concerning allegations against Sears Canada Inc. Find this application at www.ct-tc.gc.ca/english/cases/ct-2002-004/0001a.pdf, on the website of the Competition Tribunal. A brief excerpt from the grounds for the application appears below.

STATEMENT OF GROUNDS AND MATERIAL FACTS

I. GROUNDS FOR APPLICATION

1. The Commissioner states that, in connection with the promotion and sale of certain tires to the public as set out herein, Sears Canada Inc. (hereinafter "Sears") employed deceptive marketing practices which constituted "reviewable conduct" under subsection 74.01(3) of the Act.


2. Specifically, the Commissioner states that in 1999 Sears offered certain tires to the public at significantly inflated regular prices, and subsequently made specific references to those inflated regular prices when advertising those tires at sale prices. These advertisements contained "save" and "percentage off" representations which purported to be substantial discounts off Sears' regular prices on tires. For example, Sears reg. $133.99, Sale each $72.49; and "Save 45% – Our Lowest Price of the Year".

A brief excerpt from the Responding Statement from Sears Canada Inc., filed on September 18, 2002 (http://www.ct-tc.gc.ca/english/cases/ct-2002-004/0002a.pdf), appears below.

TAKE NOTICE that Sears Canada Inc. ("Sears") opposes the application aforesaid (the "Application") made to the Competition Tribunal by The Commissioner of Competition (the "Commissioner") on July 22, 2002, pursuant to subsection 74.01(3) of the Competition Act, R.S.C. 1985, c. C. 34, as amended (the "Act"), for certain relief pursuant to section 74.10 of the Act.

AND TAKE NOTICE that in support of its opposition to the Application and for the relief requested herein, Sears relies on the following Responding Statement of Grounds and Material Facts.

AND TAKE NOTICE that Sears intends to question the constitutional validity, applicability or effect of subsection 74.01(3) of the Act.

AND TAKE FURTHER NOTICE that Sears requests the relief set out below, including a determination by the Competition Tribunal that subsection 74.01(3) of the Act is constitutionally invalid and of no force or effect by reason of its infringement of Sears' fundamental freedom of commercial expression guaranteed by subsection 2(b) of the Canadian Charter of Rights and Freedoms, pursuant to subsections 8(1) and (2) of the Competition Tribunal Act, R.S.C. 1985, c. 19 (2nd Supp.), as amended.

Sears Canada Inc. was represented by Ogilvy Renault Barristers and Solicitors, Toronto, Ontario. Mr. Stephen Scholtz of Ogilvy Renault asked Dr. Ken Deal of McMaster University and marketPOWER research inc. to conduct a survey among customers of Sears Canada Inc. who had bought one or more of the tires in question during 1999. Dr. Deal's affidavit, based on a survey conducted on this matter, was submitted on October 14, 2003, and can be found on the Competition Tribunal website at www.ct-tc.gc.ca/english/cases/ct-2002-004/0078.pdf. All of the material on this website is public information. Dr. Deal filed a supplementary affidavit on January 8, 2004, which can be found at www.ct-tc.gc.ca/english/cases/ct-2002-004/0107.pdf. This latter affidavit deals with specific questions regarding the sampling for the survey. Many additional affidavits were filed in this proceeding and can be found on the same website.

Your job is to scrutinize the components of the fieldwork represented in these affidavits. This fieldwork is representative of many commercial marketing research studies. Specifically, answer the questions below.

Case Questions

1) How was the relevant population defined in this study?

2) How was the sample frame obtained?


3) How was it determined if members of the sample frame and the relevant population were interviewed?

4) Identify the response rate, the refusal rate, and the incidence rate, and explain those rates technically as well as in simple language that all can understand. Consult the findings of the response rate committee of the PMRS for further comparisons.

5) How many attempts were made to contact respondents? What does this mean? How is this done in practice?

6) As you can see from the affidavit, the fieldwork was subcontracted to Opinion Search Inc. in Ottawa. Go to the Opinion Search website and determine the nature of VOXCO. What are the benefits of VOXCO? Where and when did VOXCO originate? Identify one other Canadian-developed CATI system and contrast these two approaches.
