Process Tracing of Emotional Responses to TV Ads: Revisiting the Warmth Monitor

PIET VANDEN ABEELE
DOUGLAS L. MACLACHLAN*

Because of the transient nature of some emotions stimulated during TV commercials, measurement of emotional reactions at various points during an ad requires process tracing. This research discusses the analysis of process-tracing data using the Warmth Monitor as an illustration. We show that the establishment of the reliability and validity of process-tracing measures hinges on a suitable choice of the relevant domain of (co)variance in the data. The Warmth Monitor is shown to provide a reliable measure of warmth, but questions remain regarding the construct's meaning and valid measurement.

Emotional responses to ads are a continuing source of interest to academics and advertisers alike (e.g., Bagozzi 1988; Batra 1986; Batra and Ray 1986; Edell and Burke 1987; Holbrook and O'Shaughnessy 1984; Petty, Cacioppo, and Schumann 1983; and Zeitlin and Westwood 1986). They are distinct from other responses to ads in the following ways: (1) Emotions, like other affective responses, have a valence; they are positive or negative. (2) They are spontaneous and are largely triggered by outside stimuli before being further processed and modified by the individual. (3) They have experiential "content"; subjects are able to differentiate between a number of emotional directions or orientations. (4) Emotions are rather volatile; they respond to, and change with, variations in external conditions; their intensity can vary in the course of a single commercial. (5) Emotions have physiological arousal as a concomitant response. Typical of research on emotional ad response is the use of single, postexposure, verbal self-reports, as measurement procedures, and concurrent correlational analysis across subjects (i.e., using overall responses of an individual to a commercial as the basic observation unit). The type of stimulus, the type of measurement, and the kind of analysis are susceptible to criticism, as we now discuss.

EMOTIONAL RESPONSES AND ADVERTISING RESEARCH

Type of Stimulus

The relevant stimuli are not always single ads but could be aspects or segments of ads, especially for externally paced stimuli, such as broadcast commercials. Instead of overall reactions to ads, real-time responses to transitory elements of ads are of interest. The latter may reveal how the commercial is processed and hence contribute to a better understanding of its effectiveness (see, e.g., Boyd and Hughes 1992; Fenwick and Rice 1991; Hughes 1992; Rothschild et al. 1988; Stewart and Furse 1982).

Type of Measurement

Measuring affective responses to advertising by verbal self-reports after the fact presents difficulties because of cognitive, verbal, and/or social barriers. That is, respondents may not know, may not be able to express verbally, or may not be willing to reveal their emotional reactions. Nonverbal and immediate response recording methods are preferable, especially when the aim is to measure volatile emotional reactions (Kroeber-Riel 1982). The tracing of emotional processing seems to be particularly relevant (1) where one wishes to learn whether the intensity of a given emotion varies over the course of a commercial; (2) where the over-time pattern of the response itself is of importance, for example, where one wants to see whether the commercial builds the intensity of a given emotion from low to high (or vice versa);

*Piet Vanden Abeele is professor of marketing in the Department of Applied Economic Sciences at Catholic University of Leuven, de Beriotstraat 34, B-3000 Leuven, Belgium. Douglas L. MacLachlan is professor of marketing, Department of Marketing and International Business, University of Washington, DJ-10, Seattle, WA 98195. The authors wish to acknowledge the suggestions of John Lastovicka and substantial contributions of research assistants Luk Warlop, An Swinnen, and Koen de Vos, as well as the three reviewers. Particular gratitude is also expressed to ROGIL Field Research of Linden, Belgium, for providing the GSR hardware and software. The research was completed when the second author was on sabbatical leave at Catholic University of Leuven, 1991-1992.

© 1994 by JOURNAL OF CONSUMER RESEARCH, Inc. • Vol. 20 • March 1994. All rights reserved. 0093-5301/94/2004-0007$2.00

PROCESS TRACING OF EMOTIONAL RESPONSES

and/or (3) where the reaction to a segment of a commercial must be considered relative to another segment, for example, to decide which yields the most intense emotional response. For these purposes, a process-tracing methodology is required that registers responses continuously, or at regular intervals. Such a method will preferably be low in reactivity, nonverbal, immediate, and spontaneous, in addition to being reliable and valid. For many years, commercial testing services have provided services for dynamic recording of responses to TV ads and other broadcast programming (e.g., the Program Analyzer of Peterman [1940]). Such services are now made accessible and relatively inexpensive by the ease of recording, analysis, and display possible with modern equipment. Process-tracing methods can be classified into at least four categories: (1) continuous, proprioceptive (e.g., the Warmth Monitor of Aaker, Stayman, and Hagerty [1986]); (2) continuous, nonproprioceptive (e.g., the dial-turning method described by Hughes [1992]); (3) discontinuous, nonproprioceptive (e.g., push-button devices); and (4) autonomic (e.g., galvanic skin response [GSR] or electroencephalograph). The first category of device provides continuous, innate feedback of physiological response (e.g., self-perception of the location of one's arm). Dial-turning devices do not provide immediate self-feedback but often use visual feedback. Push-button devices require judgment (i.e., which button to push) in addition to visual control to see that the correct response is executed. Autonomic devices require no cognitive mediation in the recording of responses.

Type of Analysis

Data generated through any continuous-time registration procedure can be analyzed internally or externally and with a focus on either static or dynamic views of the trace. By "internally" is meant the comparison of parameter estimates between any two or more segments of the trace, concurrently or sequentially. For example, one might be interested in viewer emotional response to early and later segments of a commercial. "External" analysis involves examining relationships between process-trace parameters and independent data that measure antecedent, concurrent, or subsequent ad responses. An example would be an attempt to relate process-trace parameters to purchasing behavior. "Dynamic" analysis involves examination of the shape of the trace and might require time-series analysis methods. Examples might be investigating whether different types of ads result in particular characteristically shaped traces or whether ad-sequence effects depend on the time order of presentation of the key message. Static analysis reduces to statistics that do not say anything about the shape of the trace (e.g., mean or variance of response for an ad or for an ad segment).

With a few exceptions, such as Holbrook and Batra (1987), who measured various properties of ads on independent-respondent samples and conducted their research by analyzing the covariance of such properties across ads, most academic advertising research analysis follows an analysis strategy different from the one proposed here. The typical approach is to measure multiple concurrent responses of a subject to a commercial stimulus with a single method and to analyze correlations computed across subjects. It is well established that the correlational structure of multiple responses to a single stimulus computed across subjects (or the average of such correlations if multiple stimuli are involved) is not necessarily the same as the structure of the average responses computed across stimuli (Srinivasan, Vanden Abeele, and Butaye 1989). We contend that the relevant variances and covariances for advertising decision makers correspond to those between stimuli (or segments of ads) rather than those between subjects. This implies that the data matrix of interest for advertising decision making has commercial stimuli as rows, variables or traits as columns, and the (population- or sample-) average segment response of subjects as cell entries; that is, subjects are considered as replicates. Analyzing the "between" variances and covariances allows the researcher to measure different properties of the same stimuli on independent samples.
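The difference between the two correlational structures can be illustrated with a minimal sketch. The function names and the toy numbers are hypothetical, chosen only so that the two statistics diverge; they are not data from the study.

```python
from math import sqrt
from statistics import mean


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / sqrt(var_a * var_b)


def mean_within_subject_corr(x, y):
    """Average, over subjects, of each subject's correlation across segments."""
    return mean(pearson(xs, ys) for xs, ys in zip(x, y))


def between_stimuli_corr(x, y):
    """Correlation across segments of the subject-averaged responses."""
    x_bar = [mean(col) for col in zip(*x)]
    y_bar = [mean(col) for col in zip(*y)]
    return pearson(x_bar, y_bar)


# Hypothetical data: rows are subjects, columns are ad segments.
x = [[5, -4, 7, -2], [-5, 6, -3, 8]]
y = [[-5, 6, -3, 8], [5, -4, 7, -2]]

within = mean_within_subject_corr(x, y)   # strongly negative for these numbers
between = between_stimuli_corr(x, y)      # essentially +1 for these numbers
```

In this constructed case the idiosyncratic components cancel when subjects are averaged, so the segment means of the two measures move together even though each subject's own scores move in opposite directions.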

THE WARMTH CONCEPT AND THE WARMTH MONITOR

Aaker et al. (1986) proposed a paper-and-pencil, self-report process recording method and used it to obtain one particular real-time measure of affective response to TV ads. The emotion is that of "warmth" and the measurement procedure is the "Warmth Monitor." Warmth has face validity as an emotional response to the content of commercials. The Warmth Monitor method is appealing because of its simplicity and its nonverbal, immediate, and continuous registration nature (it only requires moving a pencil line steadily down a page of paper and to the right when "warmth" is experienced). The method does not require the use of recording equipment and thus has very low setup costs. It may have benefits over alternative push-button or dial devices because of the directness of the task and of a well-exercised behavior (drawing a line). A Warmth Monitor trace can also be electronically processed with scanning equipment.

In this research, the Warmth Monitor method will be used as an application example because of (1) its relevance for the field and (2) the availability of previous published evidence in the consumer research literature. Following Holbrook and O'Shaughnessy (1984), warmth is deemed to be an acute, specific, and reactive emotion, changing quickly in response to features of ads. It should therefore also be particularly well suited for continuous-response measurement.


JOURNAL OF CONSUMER RESEARCH

Aaker et al. (1986) offer an assessment of the validity and reliability of their measure, investigate relationships between warmth and general advertising responses (such as ad liking and purchase likelihood), and examine ad-sequence effects. The reliability of the Warmth Monitor measure (hereafter denoted "warmth score" when referring to the data generated by the Warmth Monitor method) was established by means of test-retest correlations. Its validity was mainly inferred by concurrent correlation with a physiological measure, namely the GSR, sometimes referred to as electrodermal response (EDR). Galvanic skin response has been discussed in several places in the consumer research literature (e.g., Bagozzi 1991; Kroeber-Riel 1979; Stewart and Furse 1982). It has exerted a certain appeal on practitioners as recording a spontaneous and hence "truthful" response to stimuli, that is, not biased by emotional, cognitive, or verbal response barriers. Galvanic skin response, as a nonverbal, continuous-response measure, is a good candidate for the validation of the warmth score because it is an index of physiological arousal. The Aaker et al. (1986) article is one of only a few to report strong evidence linking skin response to advertising; to some extent, it validates GSR while it validates the Warmth Monitor. In the spirit of replication and cumulative science development, both measurement methods need further documentation of their properties, as do the constructs that they presumably operationalize.

Our critical reexamination of warmth (the construct and the measurement method) is inspired by a number of questions prompted by Aaker et al.'s study, but with implications for emotional process tracing in general.

1. At the conceptual level, there is some doubt, based on both theory and common sense, about the validity of the warmth score and of emotional process traces in general.
There is reason to expect GSR and warmth scores to be only modestly related, if at all (Bagozzi 1991). Warmth is defined as a spontaneous emotional response of feeling and empathy. The Warmth Monitor method, however, implies an amount of central control and cognitive mediation. It requires cognitive activity from respondents, to receive and translate each element of the message in keeping with its felt warmth and to instruct their muscles to move in a particular way to record responses. Galvanic skin response, according to most sources, is an indicator of reflexive arousal and of an orienting response (Ohman 1979; Siddle and Spinks 1979; Spinks, Blowers, and Shek 1985). Arousal is not the same as warmth, although warmth, as one emotion among many, could be expected to be accompanied by arousal. Galvanic skin response, by contrast, is supposed to occur automatically, without any conscious control by the subjects.

2. Aaker et al. (1986) give a definition of warmth that is imprecise. They define warmth to be "a positive, mild, volatile emotion involving physiological arousal and precipitated by experiencing directly or vicariously a love, family, or friendship relationship" (p. 366). Warmth responses typically occur as the result of connection with social objects like persons, animals, organizations, or institutions. When viewing ads, warmth responses can occur as the result of vicarious connection with the social objects portrayed. Presumably, warmth can also be stimulated by viewer empathetic reaction to other elements of the ad such as the music background or familiar settings. All this suggests warmth is an emotion that occurs in response to a very wide range of stimuli and situations. This lack of specificity tends to be true for most of the extant process-tracing services, where the trait being measured has received very little construct definition and development. Although GSR has been demonstrated by Aaker et al. to measure quick changes in arousal, it is not necessarily related to warmth reactions. Arousal occurs as a result of many non-warmth-inducing elements of ads, such as other emotions or collative stimulus properties (Bagozzi 1991). If warmth emotions indeed induce arousal, this could be lost among other sources of arousal present in any ad execution.

3. The Warmth Monitor method is attractive, but its validity still requires further proof, as is generally the case for process-tracing methods. There is the possibility of a strong method factor in the measures the Warmth Monitor yields, for the following reasons: (1) the Warmth Monitor requires a relatively standard arm movement in tracing a line down a page; (2) there is an absence of control on the speed of this movement in tracing the line, and an assumption is made that the distance along the trace and the distance measured vertically down the page are related linearly; and (3) in the case of the Aaker et al. (1986) procedure, the trace is continued from one commercial to the next without recentering on the neutral position.
By not returning to the neutral position before each ad, the Warmth Monitor begins each ad after the first from the position the hand was in at the end of the previous ad, thus possibly resulting in the recording of artifactual carryover effects. Further, the easy implementation of the recording is balanced by the necessity of making a number of decisions at the coding stage, which brings some arbitrariness to the measurement, such as (1) the number and type of segments into which the warmth score trace will be divided, and (2) the choice of the parameter that best describes the warmth trace for each segment.

RESEARCH OBJECTIVES

Our primary purpose is to discuss the analysis of process-tracing measures for the study of audience emotional responses to commercial stimuli. We do this by using the emotional response of warmth and the process-tracing method of the Warmth Monitor because of their representative nature for emotional response in general. We begin by focusing on the reliability of the warmth scores. The objective is to demonstrate that warmth measures have acceptable true variance for decision-making purposes and/or for research. Next, our attention turns to the validity of the warmth construct. First, we reexamine the Aaker et al. (1986) finding that the warmth measure can be validated by concurrent correlation with GSR. Second, we explore the measure's usefulness in gauging reactions to specific transient elements of TV ads. Third, we try to detect the presence of ad-sequence effects. Fourth, we assess the discriminant validity of warmth and of the warmth measure by examining how distinctively warmth is measured relative to other ad responses.

Specific Propositions

The following propositions, while stated in terms specific to the Warmth Monitor application, are intended to generalize to emotional process tracing.

P1: Emotional response traces have sufficient precision to be useful in advertising decision making.

Aaker et al. (1986) report an acceptable average within-individual test-retest correlation of .81, under some explicit conditions that could imply this high correlation is the result of a common artifact in the test and retest administration.

P2: Warmth and GSR scores have overlapping construct domains.

This proposition implies a significant correlation between both measures. The primary evidence in Aaker et al. (1986) to support the validity of their warmth measure is an average within-individual correlation of .67 between 11-segment warmth and GSR scores for ads that were pretested as "warm." As discussed earlier, this correlation ought not necessarily be large. Also, it is not evident that the average within-subject correlation is the relevant validity criterion. Finally, it is unclear why Aaker et al. restrict their validation evidence to "warm" ads. The reason could be that a warm ad is one with substantial fluctuations in the warmth trace; this implies that the true variance and hence the reliability of warmth scores will be higher and hence yield less attenuated correlations for "warm" ads. The presence of high warmth-GSR correlations for warm ads is a demonstration of validity; their absence for nonwarm ads, contrary to the claim by Aaker et al., cannot be construed as evidence of discriminant validity but could be due to restricted true-warmth variance in nonwarm ads.

P3: Emotional response traces, warmth scores in particular, will be significantly correlated with GSR only when both measures are administered simultaneously to the same subjects.

As an alternative to Proposition 2, the correlation between the two measures might be due to reactivity between them. Aaker et al. examined a situation in which Warmth Monitor and GSR procedures were administered simultaneously and compared it with a condition in which only GSR was measured. Their procedure, involving comparison of average intersubject correlations, did not yield evidence of reactivity. Our analysis approach, to be described below, allows more straightforward assessment of possible reactivity.

P4: There is higher correlation between warmth and GSR scores for ads deemed to be warmer on average.

Aaker et al. (1986) report an average within-individual correlation of .67 between GSR and warmth scores for "warm" ads, as selected in a pretest, with considerably lower or negative correlations for "humorous," "irritating," and "informative" ads. It is not obvious that one should expect a GSR-warmth score relationship only for ads that are deemed to be "warm" on average (indeed, if the ads exhibited high warmth throughout, one would expect this to truncate the variability of the warmth score and make it less correlated with GSR). Nevertheless, we attempted to replicate this finding in the present research, because it may have important implications for the conditions under which emotional response traces will show validity.

P5: Warmth scores respond only to specific warmth-inducing characteristics of ads.

If warmth scores are valid measures of warmth, they should specifically respond to "emotion"-arousing components of the ads in order to show construct validity. This was not tested directly by Aaker et al. (1986), although they did demonstrate that their measure was volatile and stimulus responsive.

P6: Warmth-sequence effects exist across successive commercials.

A considerable part of the Aaker et al. (1986) article is devoted to showing that the processing of subsequent ads is influenced by the degree of emotion elicited by the previous commercial. While such a finding is of obvious interest, the Aaker et al.
procedure, to be discussed later, may be biased toward observing such effects. The nature of this possible bias will be discussed below.

P7: The warmth measure has discriminant validity.

The Aaker et al. (1986) study mainly establishes convergent validity; the litmus test often lies in discriminant validity. Aaker et al. claim discriminant validity by correlating change in warmth (from beginning to end of commercial) with postexposure ratings of the ads on warmth, humorous, informative, and irritating dimensions, obtaining a positive (.45) correlation for warmth, but low or negative correlations for the others. But warmth change from start to end of the commercial is not equivalent to the volatile warmth score response from segment to segment of the commercial, and it is the latter that needs assessment of discriminant validity. Also, their postexposure measure corresponds to "overall ad warmth," which is not the construct measured by the Warmth Monitor.
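The attenuation argument raised under Propositions 2 and 4 can be made concrete with a short numerical sketch. It assumes the classical test theory definition of reliability for a sample mean (true variance divided by true variance plus error variance over sample size) and the standard attenuation formula. The function names and the variance values are hypothetical and are not estimates from this study.

```python
from math import sqrt


def reliability_of_mean(true_var, error_var, n):
    """Reliability of an n-subject mean score: true variance share of total."""
    return true_var / (true_var + error_var / n)


def attenuated_corr(true_corr, rel_x, rel_y):
    """Expected observed correlation given the reliabilities of both measures."""
    return true_corr * sqrt(rel_x * rel_y)


# Hypothetical variance components (assumptions, not study estimates).
error_var, n = 4.0, 30
rel_warm_ad = reliability_of_mean(true_var=1.0, error_var=error_var, n=n)
rel_nonwarm_ad = reliability_of_mean(true_var=0.05, error_var=error_var, n=n)

# Same true correlation and same GSR reliability; only warmth reliability differs.
obs_warm = attenuated_corr(0.5, rel_warm_ad, 0.6)
obs_nonwarm = attenuated_corr(0.5, rel_nonwarm_ad, 0.6)
```

With these assumed numbers the ad with little true warmth variance yields the lower observed correlation even though the true correlation is identical, which is why a weak warmth-GSR correlation for nonwarm ads need not indicate discriminant validity.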

METHOD

In order to investigate the measurement properties of the warmth score and Propositions 1-6, we opted for the combination of a between- and within-group quasi-experimental design. Subjects were final-year Belgian business administration undergraduates taking a class in consumer behavior. The subjects were individually exposed to a videotape of TV commercials and other stimuli in the Consumer Research Lab at the University of Leuven, Belgium. This lab was arranged as a living room in order to put subjects at ease during ad exposure and response measurement.

FIGURE 1
EXPERIMENTAL ADS: WARMTH AND ACTIVATION CONDITIONS

[Figure: 2 x 2 grid crossing warmth (high, low) with activation (high, low), giving conditions WA, Wa, wA, and wa with three ads each. High-warmth ads (conditions WA and Wa): Woolmark (7), Becel (15), Minute Maid (8), Merci (10), Whiskas (10), Zwitsal (8). Low-warmth ads (conditions wA and wa): Coca-Cola (10), Brabantia (7), Toshiba (12), Bonzo (12), Page (5), SEB (6).]

NOTE.-In parentheses are number of three-second segments.
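As a consistency check, the segment counts in Figure 1 can be summed and compared with the totals reported in the Analysis section (110 segments, or 330 seconds of programming). The counts below are transcribed from Figure 1 on the assumption that its extracted digits have been read correctly; the variable names are illustrative.

```python
# Number of three-second segments per ad, transcribed from Figure 1
# (assumed reading of the figure, not an independent source).
segments = {
    "Woolmark": 7, "Becel": 15, "Minute Maid": 8, "Merci": 10,
    "Whiskas": 10, "Zwitsal": 8, "Coca-Cola": 10, "Brabantia": 7,
    "Toshiba": 12, "Bonzo": 12, "Page": 5, "SEB": 6,
}
SEGMENT_SECONDS = 3

total_segments = sum(segments.values())
total_seconds = total_segments * SEGMENT_SECONDS
```

The totals agree with the 110 segments and 330 seconds stated in the text, which supports this reading of the figure.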

Measurement Conditions

The responses to commercials were recorded under one of three conditions. Subjects in the first condition were given only the Warmth Monitor task, using the exact instructions from the Aaker et al. article. (Two changes were made: the instructions were translated into Dutch and the trace started at the neutral point on a new page for each commercial.) The sample size of this group was 14 respondents. The warmth-score data obtained from this group are denoted Warmth(O), to indicate that no GSR procedure was administered concurrently with the Warmth Monitor.

Subjects in the second condition were administered only the GSR procedure.¹ They were told we were interested in spontaneous responses to commercials as measured through physiological indexes. Electrodes were affixed to the palm of the subject's nondominant hand. Subjects were given time to adjust to the task environment. When their GSR traces had stabilized, they were shown the tape containing the commercials. The sample size in this group was 21 respondents. The measurements obtained from this sample are denoted GSR(O), since no warmth-score data were collected along with the GSR data.

In the third condition, both measures were collected simultaneously. The subject's dominant hand was assigned the Warmth Monitor task; the other hand was used for GSR measurement. The sample size for this condition was 30 subjects. Both measures were provided by subjects in this condition and are denoted Warmth(+) and GSR(+).

¹ Skin response was measured using the ZAK Biosystems EDAjS Module, recording electrodermal conductance at half-second intervals with electrodes affixed on the palm of the nondominant hand. The GSR data handling software is by INTER TEST (Netherlands) and ROGIL (Belgium).

Stimuli

The "treatment" consisted of a videotape containing 12 real TV commercials and insertions of one of four filler "bogus commercials" after every real one but the last. The 12 ads were preselected to represent extremes on two dimensions, warmth and activation, using the procedure described in Aaker et al. (1986) for their dimensions of warmth, information, humor, and irritation. A pretest was performed on a comparable student sample (n = 50) who rated 50 commercials on two scales, warmth and activation-excitement (i.e., for stimulus-induced orienting response). Retained commercials included three each for the following conditions: (1) high warmth, high activation (denoted WA), (2) high warmth, low activation (Wa), (3) low warmth, high activation (wA), and (4) low warmth, low activation (wa). The ads thus constituted a form of orthogonal design within the study. The scores on warmth and on activation differed significantly between the chosen ads at each extreme of the scales; the average warmth and activation ratings of the commercials were nearly uncorrelated. The ads used in the study and their lengths are shown in Figure 1.

The four "bogus commercials" consisted of various "bouncing ball" animations. Showing a bogus ad after every "real" one has the advantage of letting the GSR trace stabilize between commercials. Further, this allowed the subjects to turn to a new sheet of test paper for warmth trace recording, thus circumventing possible carryover, end-of-ad effects that might have occurred in the Aaker et al. studies. In order to reduce potential order effects, four videotapes were used containing the same real commercials in systematically varied sequences.

Analysis

The selection of the unit of observation in the study implied some choices. Aaker et al. (1986) measured warmth as the value of the trace at five (sometimes 11) equidistant points along the vertical axis of the trace of each commercial, irrespective of the total duration of the commercial. This can be considered arbitrary, mainly for three reasons. (1) The measures do not relate to the same time intervals for commercials of different length. (2) Equidistance along the vertical axis on the Warmth Monitor form does not translate into equidistance in time; pronounced changes in warmth will lead to large horizontal movements of the warmth trace and, possibly, to smaller vertical movements per unit time. (3) Measurements are at a specific instant in the commercial, rather than throughout a segment in the commercial. It may be preferable to divide a commercial into meaningful scenes (of variable duration) and to record a parameter typical for the response during that segment. Such a measurement would be difficult to obtain, however, without sacrificing some of the attractive simplicity of the Warmth Monitor procedure.

We compromised on the following measurement conventions. (1) The vertical axis along the warmth trace was divided into as many segments as there were three-second intervals in the commercials. On average, therefore, each such segment corresponds to a three-second commercial interval. Three seconds was deemed to be a useful unit of time, since a scene in a commercial generally lasts at least three seconds, since three seconds should suffice to include any lag in response production, and since it was feasible to code each three-second segment for a number of properties as described below. (2) The maximum of the trace within each segment was chosen as the representative parameter for the subject's response during that segment.
Our unit of analysis thus provides one measure of the response pattern during a segment of the commercial rather than a response descriptive of only one single instant. The same procedure was used to convert the GSR trace into discrete numbers. Here, however, the segments have exactly three-second length, as the time axis is strictly controlled in the measurement. Although GSR has some latency time, three seconds is a long enough interval to capture the response. All individual responses were ipsatized within the respondent, that is, standardized relative to the mean and standard deviation of the individual's warmth and GSR scores, respectively. This removes individual differences in the sensitivity of respondents to the two measures (see Ben-Shakhar 1985).² In what follows, comparisons (or correlations) are made based primarily on average responses (averaged over subjects) to commercial segments. The 12 real commercials together totaled 110 segments or 330 seconds of programming. Each of the 110 commercial segments receives an average score, and it is these averages that are the basic elements of the analyses discussed below. The bogus commercials each lasted 10 segments or 30 seconds.

TABLE 1
SPLIT-SAMPLE SPLIT-HALF RELIABILITY FOR WARMTH AND GSR MEASURES

Measure      n    Correlation between halves   Split-half r   n for R = .90
Warmth(O)    14   .74                          .85 [.88]      22.24
Warmth(+)    30   .92                          .96 [.95]      11.30
GSR(O)       21   .51                          .68 [.60]      89.00
GSR(+)       30   .42                          .60 [.64]      180.00

NOTE.-The measures have "0" and "+" in parentheses to denote conditions in which the single measure was taken and in which both were taken, respectively (see Fig. 1). The numbers in brackets are obtained when using the average sample-mean variance as an error variance estimate in the reliability formula; the reliability estimates computed on the (arbitrarily) split samples are generally quite close to these values. The n for R = .90 is the estimated sample size needed for a reliability of .90.
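The measurement conventions described in the Analysis section (segment maxima, standardization within the respondent, and averaging over subjects) can be sketched as follows. This is an illustrative sketch, not the software used in the study: the function names are hypothetical, each trace is assumed to be a list of digitized readings ordered along the vertical axis, and the population standard deviation is assumed for ipsatizing.

```python
from statistics import mean, pstdev


def segment_maxima(trace, n_segments):
    """Split a digitized trace into n_segments nearly equal parts; keep each maximum."""
    size = len(trace)
    if n_segments < 1 or size < n_segments:
        raise ValueError("need at least one reading per segment")
    bounds = [i * size // n_segments for i in range(n_segments + 1)]
    return [max(trace[lo:hi]) for lo, hi in zip(bounds, bounds[1:])]


def ipsatize(scores):
    """Standardize one respondent's scores by that respondent's own mean and SD."""
    m, sd = mean(scores), pstdev(scores)
    if sd == 0:
        raise ValueError("respondent shows no variation")
    return [(s - m) / sd for s in scores]


def segment_means(subject_scores):
    """Average scores over subjects, giving one value per segment."""
    return [mean(col) for col in zip(*subject_scores)]


# Hypothetical traces for two respondents and one 9-second (3-segment) ad.
# Here the three segments stand in for a respondent's full set of scores;
# in the study the standardization runs over all of a respondent's scores.
traces = [[0, 1, 3, 2, 5, 4], [2, 2, 1, 4, 3, 3]]
per_subject = [ipsatize(segment_maxima(t, 3)) for t in traces]
averages = segment_means(per_subject)
```

The resulting list of per-segment averages corresponds to one row block of the stimulus-by-variable data matrix on which the correlations are computed.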

RESULTS

Reliability of Warmth and of GSR

Reliability is a relative concept, defined in relation to the stimulus domain (e.g., single ad or series of ads) and to the sample size. Since the error variance (i.e., the sampling variance of the mean warmth or GSR score) is known, the reliability could be estimated directly except that there is no assurance of homoskedasticity for error variance between commercial segments. The "average" reliability can only be computed if an "average" error variance is entered into the reliability formula. This difficulty is avoided if we (randomly) split the subject sample in half and compute two parallel series for the mean warmth or GSR score for each segment for each subsample. Next, we correlate these series and compute the split-half reliability at the actual sample size used in our study and estimate the sample size required for a given level of reliability, for example, .90 (Nunnally 1978). This gives the results shown in Table 1.

The reliability coefficients are such that the hypothesis of no true variance can safely be rejected (the relevant domain size is T = 110 segments). Obviously, if these results were used to extrapolate to individual measurements (n = 1), the resulting reliabilities would be extremely low and show that correlations based on individual data could hardly ever reach significant values. It is evident that, at the sample sizes used, the warmth measure has sufficient reliability. This is not the case for the GSR scores. The reliabilities for GSR are still sufficient to carry out research, but they obviously limit the extent to which other variables can be correlated with GSR scores.³

² Ipsatizing the scores does not eliminate error variance due to the self-pacing character of the Warmth Monitor. (The trace midpoint for any two individuals, e.g., cannot be presumed to reflect their reactions to the middle of the commercial.) This is a potentially important source of measure unreliability.
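The split-half computation can be sketched with the Spearman-Brown formula, which is assumed here because it reproduces the Table 1 entries from the reported half-sample correlations. The function names are hypothetical.

```python
def spearman_brown(r_halves, factor=2.0):
    """Reliability after lengthening a measure (here, the sample) by `factor`."""
    return factor * r_halves / (1.0 + (factor - 1.0) * r_halves)


def n_for_reliability(rel, n, target=0.90):
    """Sample size at which reliability `rel`, observed at size n, reaches target."""
    factor = target * (1.0 - rel) / (rel * (1.0 - target))
    return factor * n


# Warmth(O) row of Table 1: n = 14, correlation between halves = .74.
rel_warmth = spearman_brown(0.74)        # about .85
needed = n_for_reliability(0.85, 14)     # about 22.2
```

Applying `n_for_reliability` to the GSR(+) row (reliability .60 at n = 30) gives a required sample of about 180, in line with the last column of Table 1.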

Do Warmth and GSR Scores Measure the Same Thing?

Our evidence regarding Propositions 2 and 3 is found in Table 2. This is a kind of multitrait-multimethod matrix, showing the reliabilities on the diagonal and the correlations below the diagonal (Bagozzi and Yi 1991). The above-diagonal entries are the correlations corrected for attenuation due to lack of reliability (see Nunnally 1978).⁴ All these entries, again, are computed on sample-mean scores across all 110 commercial segments.

The results show high monotrait correlations, certainly when corrected for attenuation. This confirms that the same trait is measured irrespective of the measurement condition (i.e., whether warmth and GSR scores are measured concurrently or independently). We see that, contrary to Proposition 2, warmth and GSR scores are little, if at all, related. The heterotrait correlations are nonsignificant, except for the concurrent measurement condition, where the correlation is positive and significant, though still weak. This correlation is in contrast to the average intrasubject warmth-GSR correlation of .67 reported by Aaker et al. (1986) for "warm" ads.⁵

Only when Warmth Monitor and GSR procedures are administered concurrently (the situation hypothesized in Proposition 3) is the correlation significant, and, even in that condition, the correlation is low (.21 observed, .28 after correction for attenuation). The correlation is not significant when one of the independent measurement conditions is correlated with one of the concurrent measurement conditions (i.e., Warmth(O) with GSR(+), or GSR(O) with Warmth(+)). Possibly one measurement task contaminates the other. There could be two (observationally indistinguishable) reasons for this result: (1) the simultaneous measurement procedure introduces a method artifact in both measures, that is, instructing the respondent to record both warmth and GSR introduces a common factor to both measures; and (2) the correlation is due to the fact that the same respondents are measured, that is, it is due to contemporaneous error variance around the mean.

We conclude that warmth and GSR scores nominally measure different things, certainly when measured on independent samples. When Warmth Monitor and GSR procedures are administered simultaneously, there is slight evidence of a joint factor.

³ Split-sample correlations for each ad were also computed, since the trace obtained for a single ad may be the relevant domain of variation. Individual ad reliabilities for warmth scores ranged from zero to virtually perfect. The average reliability for GSR was very poor at the sample sizes employed, and the heterogeneity in reliabilities was pronounced.

⁴ If $\rho_{XY}$ is the true correlation between X and Y and the latter are measured with reliabilities $R_X$ and $R_Y$, then the observed correlation $r_{XY}$ will be attenuated to $E(r_{XY}) = \rho_{XY}\sqrt{R_X}\sqrt{R_Y} \le \rho_{XY}$.

⁵ In a later study, Stayman and Aaker (1993) found a much smaller, although still significant, .24 correlation between warmth and GSR. Correlations based on individual sequential responses are likely to be subject to autocorrelation bias, which might account for part of their observed correlation.

TABLE 2
RELIABILITIES AND CORRELATIONS OVER MEASURES AND CONDITIONS

             Warmth(O)          Warmth(+)          GSR(O)             GSR(+)
Warmth(O)    .85a               .91b               .06b               .07b
Warmth(+)    .825 (p = .0001)   .96a               .10b               .28b
GSR(O)       .046 (p = .637)    .083 (p = .390)    .68a               .83b
GSR(+)       .052 (p = .588)    .213 (p = .025)    .529 (p = .0001)   .60a

NOTE.-The measures have "0" and "+" in parentheses to denote conditions in which only the single measure was taken and in which both were taken, respectively (see Fig. 1). T = 110 segments.
a Reliabilities.
b Correlation corrected for attenuation because of unreliable measurement.
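The correction for attenuation behind the above-diagonal entries of Table 2 can be checked directly from the reliabilities and observed correlations reported there. The sketch below is illustrative only; the function name and the short condition labels ("W0", "W+", "G0", "G+") are hypothetical shorthand, and the inputs are the rounded values printed in the table.

```python
import math


def disattenuate(r_xy, rel_x, rel_y):
    """Classical correction for attenuation: divide the observed correlation
    by the square root of the product of the two reliabilities."""
    return r_xy / math.sqrt(rel_x * rel_y)


# Reliabilities (diagonal of Table 2); labels are shorthand for the four series.
reliability = {"W0": 0.85, "W+": 0.96, "G0": 0.68, "G+": 0.60}

# Observed correlations (below-diagonal entries of Table 2).
observed = {
    ("W0", "W+"): 0.825,
    ("W0", "G0"): 0.046,
    ("W0", "G+"): 0.052,
    ("W+", "G0"): 0.083,
    ("W+", "G+"): 0.213,
    ("G0", "G+"): 0.529,
}

# Corrected correlations, rounded to two decimals as in the table.
corrected = {
    pair: round(disattenuate(r, reliability[pair[0]], reliability[pair[1]]), 2)
    for pair, r in observed.items()
}
```

With these inputs the corrected values come out as .91, .06, .07, .10, .28, and .83, matching the above-diagonal entries.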

Is There Higher Correlation between Warmth and GSR Scores for Warm Ads?

We examined Proposition 4 with the heterotrait correlations in Table 3, again computed on sample average scores across segments but for each ad separately. Sample sizes were quite small, corresponding to the number of segments in each ad. Contrary to the Aaker et al. (1986) results, not many significant correlations were observed, and, even more revealing, there is no discernible pattern. There is a tendency for GSR and warmth scores to be related under high warmth and high activation conditions, but this only occurs when relating GSR(O) with Warmth(+) and Warmth(O). There is no significant correlation between warmth and GSR scores for high-warmth and low-activation conditions. Therefore, we were unable to find evidence, even under conditions of high-warmth ads, that GSR and warmth scores measure overlapping constructs.⁶

⁶ Intrasubject warmth-GSR score correlations for separate commercials (similar to those calculated by Aaker et al. [1986]) were insignificant or weak whether or not the ads were judged warm. The Aaker et al. findings were not replicated in our results.

PROCESS TRACING OF EMOTIONAL RESPONSES

TABLE 3
HETEROTRAIT CORRELATIONS BETWEEN WARMTH AND GSR UNDER DIFFERENT WARMTH AND ACTIVATION CONDITIONS

                                         GSR(+) correlated with    GSR(O) correlated with
Ad condition                             Warmth(+)   Warmth(O)     Warmth(+)   Warmth(O)
WA (high warmth/high activation):
  Minute Maid                             .53         .57           .75*        .80*
  Woolmark                               -.59        -.46          -.79*       -.67
  Whiskas                                 .23         .14           .85**       .80**
Wa (high warmth/low activation):
  Becel                                  -.27        -.69**        -.24        -.20
  Merci                                   .32         .34          -.23        -.06
  Zwitsal                                -.12        -.16           .29         .30
wA (low warmth/high activation):
  Coke light                              .28         .05           .88**       .82*
  Toshiba                                 .19         .17           .20         .58*
  Page                                    .07        -.06          -.32        -.28
wa (low warmth/low activation):
  Brabantia                               .35         .18          -.09        -.50
  Bonzo                                   .60*        .58           .60*       -.52
  Seb                                     .66         .66           .35         .46

NOTE.-The measures have "0" and "+" in parentheses to denote conditions in which only the single measure was taken and in which both were taken, respectively (see Fig. 1). Sample size for correlations is the number of segments for each commercial (ranging from five to 10).
* p < .05.
** p < .01.

Does Warmth Score Respond Only to Warmth-inducing Aspects of Commercials?

Proposition 5 is really double: the warmth score (1) correlates with particular characteristics of TV commercial segments that are related to warmth and (2) does not correlate with characteristics that are not related to warmth (e.g., more general collative properties). Testing whether the results for the segment characteristics offer supporting evidence for our proposition is difficult in the absence of a controlled experiment. As an approximation, we consider the univariate correlations between the segment characteristics and the sample-mean warmth-score response and contrast these with the relationship (sign) predicted by a panel of judges.

The judges, seven management school research assistants, were asked independently to consider each characteristic as well as the definition of the warmth response. They were then asked to predict for each characteristic whether they expected a significantly positive, a neutral, or a negative correlation with the warmth-score trace; their responses were coded +1, 0, or -1, respectively. Interjudge consistency was 59.1 average percent agreement.⁷ Each characteristic was then coded in terms of the sum of the judges' scores, which could range from +7 to -7. The characteristics, their judged warmth-inducing nature, and their correlations with the warmth measures (Warmth(+) and Warmth(O)) are shown in Table 4. Because of the high correlation between the Warmth(+) and Warmth(O) series, we pooled both series to achieve a sample size of 220. The table is divided according to whether the judges on balance agreed that the characteristics were "warmth" correlated or not. Remarkably, all characteristics independently judged as warmth inducing correlated positively or nonsignificantly with the warmth measure, whereas all other characteristics correlated negatively or nonsignificantly. We conclude there is evidence in favor of Proposition 5. The warmth score does respond to transient segment characteristics in a way that supports its convergent and discriminant validity.

To study the proposition further, we ran separate dummy-variable regressions with warmth score as the dependent variable and with explanatory variables in three groups: first, ad condition (i.e., warmth/activation category); second, segment sequence (first, second, middle, next to last, last); and third, segment properties (specific coded characteristics of the ad execution). Also, we added the GSR measure as a covariate. Table 5 contains the results of these sequential regressions, along with the adjusted $R^2$ values achieved by adding each set of explanatory variables. The coefficients shown are as estimated for the complete model.⁸ From Table 5 we observe the following.

⁷ Although we would have preferred a higher interjudge reliability, most discrepancies were between neutral and signed directions. Only 3.6 percent of possible comparisons resulted in positive-negative mismatches.

⁸ Similar sequential regressions were also run using lagged warmth score as the dependent variable, in order to investigate possible delays in the warmth response or the possibility of memory factors in the time series. Also, regressions were run using the segment-to-segment differences in warmth scores, with the hope of eliminating the dependence between successive measures. Neither type of regression provided useful additional insights.
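The judge coding described in this section (predictions coded +1, 0, or -1, summed over judges, with an average percent agreement) can be sketched as follows. How the agreement figure was actually computed is not stated in the text; this sketch assumes it is the share of matching codes averaged over all pairs of judges. The ratings shown are made-up values for three hypothetical judges, not the study's data.

```python
from itertools import combinations

# Hypothetical ratings: judge -> predicted sign for each of four characteristics.
EXAMPLE_RATINGS = {
    "judge_1": [1, 1, 0, -1],
    "judge_2": [1, 0, 0, -1],
    "judge_3": [1, 1, -1, 1],
}


def judged_scores(ratings):
    """Sum the +1/0/-1 predictions over judges for each characteristic."""
    return [sum(column) for column in zip(*ratings.values())]


def average_pairwise_agreement(ratings):
    """Share of characteristics coded identically, averaged over judge pairs."""
    shares = [
        sum(x == y for x, y in zip(a, b)) / len(a)
        for a, b in combinations(ratings.values(), 2)
    ]
    return sum(shares) / len(shares)


def sign_mismatch_rate(ratings):
    """Share of all pairwise comparisons that pit +1 against -1."""
    mismatches = comparisons = 0
    for a, b in combinations(ratings.values(), 2):
        for x, y in zip(a, b):
            comparisons += 1
            mismatches += (x * y == -1)
    return mismatches / comparisons
```

With seven judges the summed score ranges from -7 to +7, as in the text; for the three-judge example above, `judged_scores(EXAMPLE_RATINGS)` gives `[3, 2, -1, -1]`.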


TABLE 4
PAIRWISE CORRELATIONS BETWEEN WARMTH MEASURE AND COLLATIVE PROPERTIES

Collative property                      Judged score(a)   Correlation with warmth   p-value (two-tailed)
Warmth-inducing:
  Emotional experience                  +7                 .42                      .0001
  Nonverbal friendly interaction        +7                 .34                      .0001
  Music                                 +7                 .47                      .0001
  Song                                  +7                 .52                      .0001
  Props (cartoon figure, etc.)          +7                 .36                      .0001
  Protagonist laughing                  +6                 .17                      .013
  Verbal interaction                    +6                 .02                      .82
  Two actors                            +4                 .33                      .0001
  Three or more actors                  +3                 .07                      .30
  Bodily movement                       +3                 .40                      .0001
Non-warmth-inducing:
  One scene                             +2                -.06                      .35
  Close-up                              +1                -.18                      .007
  One person                             0                -.11                      .09
  Two scenes                             0                -.04                      .55
  Offscreen voice                       -2                -.40                      .0001
  Product and brand displayed           -3                -.19                      .005
  Product demonstration                 -3                 .06                      .36
  Three or more scenes                  -3                 .10                      .14
  Brand name superimposed               -3                -.01                      .86
  Product displayed                     -4                -.05                      .51
  Informational text superimposed       -5                 .03                      .67
  No person                             -5                -.31                      .0001

NOTE.-T = 220 because we pooled over Warmth(+) and Warmth(O).
(a) Judged score is an agreement score among independent judges that elements of the ads would correlate with warmth.
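As a rough plausibility check on the two-tailed p-values in Table 4, a correlation based on T = 220 observations can be tested with Fisher's z transformation. The text does not say which test the authors used, so this is only an approximation under that assumption (and it treats the 220 pooled observations as independent); small differences from the tabled values are to be expected because the correlations are rounded to two decimals.

```python
import math
from statistics import NormalDist


def fisher_z_p_value(r, n):
    """Approximate two-tailed p-value for H0: rho = 0 using Fisher's z.

    atanh(r) is approximately normal with standard error 1 / sqrt(n - 3).
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Correlations taken from Table 4, each with T = 220.
p_laughing = fisher_z_p_value(0.17, 220)      # "Protagonist laughing"
p_three_actors = fisher_z_p_value(0.07, 220)  # "Three or more actors"
```

Under this approximation a correlation of .17 is significant at roughly the .01 level and a correlation of .07 is not significant, in line with the pattern in the table.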

1. Remembering that the reliability of Warmth(O) and Warmth(+) sets a ceiling of approximately .90 on the variance to be explained, the fit of the regressions is excellent.

2. Ad condition, a nondynamic property of the commercials, captures three-fourths of the variance; remarkably, the differentiation is between the "wa" condition and the other conditions; "wa" ads offer relatively little gratification to the viewer. Apparently, the level of the warmth-score trace between commercials is not able to discriminate between warm and nonwarm ads. The average trace for all four conditions is shown in Figure 2.⁹ Aside from the difference in level between the "wa" condition and the other conditions, the pattern or shape of the warmth trace also differs between the "wa" condition and the other conditions. Figure 2 indicates that the warmth-score trace to these commercials in preset conditions is basically of only two types: "alive" or not.

3. The segment sequence has a significant impact and points to a "characteristic" shape for the warmth-score trace, namely, a trace that starts below average and levels off (or declines) toward the end. This pattern may be of relatively little interest if it is the result of an artifact (i.e., the trace starts at the neutral point and returns to neutral at the end of the commercial). Alternatively, commercials may on average evoke a typical warmth-score trace, a standard to which one may choose to adapt or to contrast one's specific ad.

4. Segment characteristics add a limited, but still significant, amount of explained variance. The hypothesized significances are generally obtained.¹⁰ The coefficients of the segment property variables again, for the most part, are positive (or nonsignificant) for the warmth-inducing properties and the reverse for the non-warmth-inducing properties. The only exceptions are three properties: "bodily movement," "offscreen voice," and "product + brand visible."

5. Galvanic skin response scores contribute a small but significant amount to explained warmth-score variance at the margin, again corroborating earlier evidence supporting a weak domain overlap.¹¹

We conclude there is fairly strong evidence in favor of Proposition 5. The Warmth Monitor does differentiate among properties of commercials in relation to their "warmth" character.

⁹ Note that Figs. 2 and 3 are plots of segment averages across ads at selected points in each ad; hence they obscure the segment-to-segment volatility existing in individual process traces.

¹⁰ Not all properties listed in Table 4 occurred in Table 5. This is because some of them were collinear with other properties or their incidence was too small.

¹¹ The variable used in the regression is GSR(+) × DV, where DV = 1 if the dependent variable is Warmth(+) and zero otherwise. When other variables are used as the regressor (e.g., GSR(O) or GSR(+)), the coefficient is nonsignificant. In other words, GSR explains variation in warmth scores only in the condition in which both measures are administered jointly.

TABLE 5
REGRESSION OF WARMTH SCORE ON EXPLANATORY VARIABLES

                                        Warmth score
Variable                                Parameter estimate   Two-tailed p-value
Intercept                                -.93                .0001
Ad condition:(a)
  WA                                     1.20                .0001
  Wa                                      .84                .0001
  wA                                     1.39                .0001
Segment position:(b)
  First                                  -.27                .02
  Second                                  .04                .74
  Middle                                  .26                .005
  Next to last                            .24                .01
Collative segment properties:(c)
 Warmth-inducing properties:
  Emotional experience                   -.15                .07
  Nonverbal friendly interaction          .35                .0001
  Music                                   .10                .23
  Song                                    .18                .02
  Props                                   .25                .0001
  Laughing protagonist                    .19                .13
  Bodily movement                        -.48                .0001
 Non-warmth-inducing properties:
  Close-up                               -.07                .28
  One actor                               .14                .24
  Two scenes                             -.04                .49
  Offscreen voice                         .29                .0002
  Product and brand visible               .31                .0001
  Product demonstration                   .03                .54
  Three or more scenes                   -.10                .20
  Product visible                        -.39                .0001
  Information superimposed                .08                .22
  Information mentioned                  -.19                .03
GSR-dummy(d)                              .39                .002

(a) Adjusted $R^2$ = .59.
(b) Adjusted $R^2$ = .65.
(c) Adjusted $R^2$ = .78.
(d) GSR-dummy is defined as GSR(+) when the dependent variable is Warmth(+) and zero elsewhere. Adjusted $R^2$ = .79.

Sequence Effects in Warmth Measures

Aaker et al. (1986) devote considerable attention to sequence effects, that is, the impact of emotional response to preceding commercials on the processing of subsequent ads. We discussed above why these results need further validation. We expect ad-sequence effects of the type described by Aaker et al., but we also expect these effects to be modest compared to the main effect of the emotional response level of the ad itself, once the possibly confounding end-of-commercial effect in the Aaker et al. Warmth Monitor procedure is eliminated. However, by inserting bogus filler ads between commercials, our method was biased against observing sequence effects.


For this purpose, we proceed as follows: each subject in our Warmth(O) condition was exposed to a sequence of 12 commercials, with neutral bogus ads inserted between these commercials. Each individual warmth-score trace, except for the first commercial, can therefore be coded (1) for the warmth and activation level of the commercial itself and (2) for the warmth and activation level of the preceding commercial. Finally, each segment of the commercial can be coded for its position in the ad (first, second, middle, next to last, last).¹² Thus, the data can be analyzed in an ANOVA framework, with the individual (ipsatized) warmth-score response for the segment as the dependent variable and with the main effects and interactions of the following as independent variables: (1) the warmth level of the ad, (2) the warmth level of the preceding ad, (3) the activation level of the ad, (4) the activation level of the preceding ad, and (5) the position of the segment. Sequence effects would require the main effects or interactions of (2) and (4) to be significant.

The result is shown in Table 6. The sequence effect is evident, since the factor "previous ad warm" has a statistically significant impact on warmth score after controlling for current ad warmth and the other factors, although its interaction with segment position is nonsignificant. Figure 3 shows the nature of the sequence effect. In all segments, prior nonwarm ads lead to higher average warmth-score values than prior warm ads. We conclude for Proposition 6 that sequence effects do occur even after eliminating potential artifactual effects not controlled in the Aaker et al. study. The effects are quite small, as can be seen (in Table 6) from the low sum of squares for prior ad warmth. As indicated above, sequence effects were likely biased downward in our design by the use of the interspersed neutral bogus ads. Given that fact, the small sequence effects observed are actually quite impressive.

The sequence effects discussed in Aaker et al. (1986) correspond, in our ANOVA, to the interaction effect of previous ad warmth and segment position, which is nonsignificant. We note that the activation level also gives rise to modest sequence effects.
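The remark that the sequence effects are small can be put on a common scale using only the figures printed in Table 6. The sketch below is an illustration, not part of the original analysis. It assumes that each F-value is the effect mean square divided by a single error mean square with 3,009 degrees of freedom (the denominator of the overall F), and that the two current-ad main effects have one degree of freedom each, which is what the 24 numerator degrees of freedom imply. Partial eta squared is a swapped-in effect-size measure that the text itself does not report.

```python
# (source, df, sum of squares, F-value) as listed in Table 6.
TABLE_6 = [
    ("current ad warmth", 1, 94.10, 136.44),       # df = 1 assumed, see lead-in
    ("current ad activation", 1, 263.84, 382.54),  # df = 1 assumed, see lead-in
    ("previous ad warmth", 1, 6.55, 9.50),
    ("previous ad activation", 1, 3.33, 4.83),
    ("segment position", 4, 48.39, 17.54),
    ("previous ad warmth x segment", 4, 2.61, 0.95),
    ("current ad warmth x segment", 4, 25.92, 9.39),
    ("previous ad activation x segment", 4, 2.63, 0.95),
    ("current ad activation x segment", 4, 62.33, 22.59),
]
ERROR_DF = 3009  # denominator df of the overall F(24, 3009)


def implied_error_mean_square(rows):
    """Each F equals (SS / df) / MSE, so every row implies an MSE; average them."""
    estimates = [(ss / df) / f for _, df, ss, f in rows]
    return sum(estimates) / len(estimates)


def partial_eta_squared(ss_effect, ss_error):
    """Effect sum of squares as a share of effect plus error sum of squares."""
    return ss_effect / (ss_effect + ss_error)


mse = implied_error_mean_square(TABLE_6)
ss_error = mse * ERROR_DF
effect_sizes = {
    name: partial_eta_squared(ss, ss_error) for name, _, ss, _ in TABLE_6
}
```

Under these assumptions the implied error mean square is about .69, and the partial eta squared for previous ad warmth is well under 1 percent, against more than 10 percent for current ad activation.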

Discriminant Validity of Warmth and of the Warmth Monitor

To examine Proposition 7, we study how subjects respond to different process-tracing tasks applied to the same stimuli. Do they provide different responses if asked to trace other cognitive and affective reactions to the ads by means of the same process-tracing methodology? If we can demonstrate this, it would provide evidence for the discriminant validity of the warmth measure.

¹² Note that segment position is not equivalent to time, since the commercials varied in length. "Middle," therefore, stands for third, fourth, and so on up to the one before the next-to-last segment.


FIGURE 2
MEAN SCORE BY AD WARMTH AND ACTIVATION

[Line plot: mean Warmth(+) score (vertical axis, approximately -1.5 to 1) against segment position (horizontal axis, 1 to 5), with one trace per ad condition.]

NOTE.-Segment positions are as follows: 1, first; 2, second; 3, middle; 4, next to last; and 5, last. One trace is plotted for each of the four conditions: "wa" (low warmth, low activation), "Wa" (high warmth, low activation), "wA" (low warmth, high activation), and "WA" (high warmth, high activation).

Design. The data for this phase were collected in a separate study devoted to the collection of emotional response data and judgments of the attention-arousing effect of the commercial segments, again using the Warmth Monitor method to perform the process-tracing task. The commercial stimuli used were the 12 ads mentioned in the studies above, taped in different sequences in order to neutralize potential order effects. Subjects from the same student pool were given one single trait on which to perform the process-tracing task for all 12 commercials. The subjects received the instructions shown in Aaker et al. (except for translation into Dutch, the definition of a response other than warmth, and the instruction to start a new sheet for each commercial). The second page of instructions showed an example of a process trace. Subjects were next given a practice trial on a test commercial. Finally, the process-tracing measure was taken for each commercial separately; subjects were allowed time to turn over the test form for each new commercial in order to avoid artifactual sequence effects. Each response was measured on a sample of 10 respondents, except for the attention-raising power, which was measured on a sample of 20 respondents. The process-tracing results were converted into sample-mean scores per (approximate) three-second segment using the same procedure as in the main study. All analyses performed below are based on sample-mean correlations across 110 commercial segments. We underline the fact that each trait is measured on a different sample of respondents, so that errors around the sample means should be independent.
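The conversion of individual traces into sample-mean scores per segment can be sketched as below. The exact procedure is described in the main study rather than here, so several details are assumptions: the trace is taken to be a list of equally spaced readings, the segment score is the highest reading in the interval (the discussion section mentions this choice), and ipsatizing is taken to mean standardizing each subject's scores by that subject's own mean and standard deviation. All function names are hypothetical.

```python
from statistics import mean, pstdev


def segment_scores(trace, readings_per_segment):
    """Collapse one subject's trace into segment scores, keeping the highest
    reading reached within each segment (assumed summary statistic)."""
    return [
        max(trace[start:start + readings_per_segment])
        for start in range(0, len(trace), readings_per_segment)
    ]


def ipsatize(scores):
    """Standardize within a subject: subtract the subject's own mean and
    divide by the subject's own standard deviation (assumed definition)."""
    centre, spread = mean(scores), pstdev(scores)
    if spread == 0:
        return [0.0 for _ in scores]
    return [(value - centre) / spread for value in scores]


def sample_mean_series(per_subject_scores):
    """Average the subjects' series segment by segment."""
    return [mean(column) for column in zip(*per_subject_scores)]
```

A typical use would chain the three steps: segment each subject's trace, ipsatize the result, and pass the list of ipsatized series to `sample_mean_series` to obtain one mean score per segment.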

Response Dimensions. Based on Holbrook's (1986) discussion of the Plutchik (1980) emotion inventory, six of the latter's original eight emotion dimensions (omitting fear and sadness) were retained for process tracing by means of the Warmth Monitor method: joy, surprise, anticipation, anger, acceptance, and disgust. For each of these emotional responses and for the judged attention-getting power, four anchor points were defined on the response sheet in analogy with the Warmth Monitor procedure. To the data series, we also added the warmth measures obtained in our first study (i.e., Warmth(O) and Warmth(+)) to obtain a total of nine series. Our research question is whether the seven traits, which are supposed to be independent and only slightly correlated with the emotional response of warmth, do indeed appear as unrelated when measured by the same Warmth Monitor method.

The sample-mean data were used for a principal components factor analysis. The result, shown in Table 7, reveals one dominant axis, with an eigenvalue of 5.98, or 67 percent of the variance, while the second eigenvalue does not exceed 1.00. The two-factor solution breaks out disgust as the nucleus of an acceptable second factor under varimax rotation.¹³

Generally speaking, the evidence is that the warmth construct as measured by the Warmth Monitor method lacks discriminant validity. It should be noted that, subsequent to the research conducted here, Stayman and Aaker (1993) published results of a comparable study, which found that the Warmth Monitor was not simply measuring liking and again claimed evidence for both convergent and discriminant validity of the warmth measure. As in the case of the Aaker et al. (1986) article, however, this was done by correlating warmth change (from beginning to end of ad) with postexposure measures of entire ad warmth, humor, irritation, and liking. The change measure, based only on two points (first and last) in the ad, is not the same as the dynamic process-tracing measure, which may fluctuate up and down over the course of the ad. Thus, the warmth trace is not validated by such analysis.

¹³ We also examined the unidimensionality of a scale composed of the nine indicators and subsets of these using LISREL (Steenkamp and van Trijp 1991). Although all nine do not appear to tap the same underlying dimension, a scale composed of the items surprise, anticipation, acceptance, and Warmth(O) cannot be rejected as unidimensional ($\chi^2(2) = 1.84$, p < .398, AGFI [adjusted goodness-of-fit index] = 1.000).

TABLE 6
EFFECT OF PRIOR AD WARMTH AND ACTIVATION ON WARMTH MEASURE

Source                                  df    Sum of squares   F-value   p-value
Current ad:
  Warmth                                 1     94.10           136.44    .0001
  Activation                             1    263.84           382.54    .0001
Previous ad:
  Warmth                                 1      6.55             9.50    .0021
  Activation                             1      3.33             4.83    .0280
Segment position                         4     48.39            17.54    .0001
Interactions:
  Previous ad warmth × segment           4      2.61              .95    .4359
  Current ad warmth × segment            4     25.92             9.39    .0001
  Previous ad activation × segment       4      2.63              .95    .4315
  Current ad activation × segment        4     62.33            22.59    .0001

NOTE.-Overall F(24, 3009) = 47.58; p < .0001; $R^2$ = .275.

DISCUSSION

Process-tracing methods have existed for many years, but their benefits for theory and practice (e.g., for understanding advertising impacts) have yet to be fully realized. In this article we provided insights into the analysis of process-tracing data, while also attempting to validate one such method, the Warmth Monitor, thereby replicating and extending research reported in the Aaker et al. (1986) article.

A most interesting aspect of process-tracing methods is the opportunity to examine patterns of response to transient external stimuli over time. Yet, for the most part, the methods' data have not been analyzed in a way that exploits this benefit, at least by consumer researchers (see, e.g., Boyd and Hughes 1992). We posit that the best way to do so is to use average segment responses by samples of subjects, thus emphasizing intersegment variation rather than intersubject variation. Comparisons of segment averages (directly or through correlations) provide the basis for examining patterns of response to continuously changing stimuli and for judging the relative effectiveness of specific elements of, for example, TV commercials.

General Remarks regarding Process Tracing

Process-tracing measures should be developed and tested thoroughly so that we have methods to better understand dynamic consumer responses to transient marketing stimuli. To fully exploit such methods, we should also consider carefully which sources of covariation should be examined. We have argued that, for the purposes of examining impacts of particular communication elements or of patterns of response over time within a commercial, the important source of variation is between segments, not between subjects. Further, we think that process-tracing data should be investigated using simple or even complex time-series analysis methods in order to respect the temporal dimension of the data and study design.

Many issues involving process tracing have surfaced in this project. For example, is the three-second interval the proper one? Should a value other than the highest in the interval be used as the parameter to describe segment responses of individuals? How should we examine patterns of response within ads? Are there characteristic response patterns across segments of any ad that should be used as baselines? Are there advantages to designing ads with particular emotion-response patterns?

The most important questions, from an academic and a pragmatic viewpoint, deal with the use of process-tracing data. Process traces yield an over-time pattern of responses, for example, the temporal response pattern to a commercial. The shape of this pattern is itself of interest. How should we describe it (i.e., by means of which parameters), and what are the different response pattern types, if any?

Process traces can be used as dependent variables. The question here is understanding the determinants of the process trace and the causes of differences in response pattern to different commercials. Later responses may be triggered by initial responses within the same commercial or across commercials (e.g., the modest sequence effects demonstrated in our study). Overall response patterns may differ because of differences in commercials, in audience, or in exposure/processing conditions. The development of a process trace should, in particular, be studied on the basis of hypotheses and theories concerning stimulus processing.

Treated as causal variables, process-trace indicators (e.g., the temporal shape) could be examined for their impact on the ultimate effect of the commercial stimulus (e.g., in terms of brand awareness, evaluation, or purchase intention).

The process-tracing methodology is applicable to various dimensions of processing, for example, attention, cognitive processing, emotional responding, and so forth. Effort is needed to prove the discriminant validity of the indicators measured by process-tracing methods and analyzed on a between-segment basis.

Specific Remarks regarding the Warmth Monitor

Using segment average data, we investigated the reliability and validity of the Warmth Monitor as a procedure to measure the "warmth" emotional construct. We observed the warmth measure (unlike the GSR process-tracing measure) to be quite reliable, even with small samples.

Unlike Aaker et al. (1986), we were unable to provide much evidence of convergent validity of the warmth measure with GSR. A reason for this might be the weak theoretical correspondence between warmth and arousal (or orienting response), the latter being associated with GSR. Also, there is a confusion between warmth, defined as a volatile emotion with fleeting character, and the use of the construct to describe entire ads as being "warm" or not. There ought to be at least some overlap of the two constructs' domains, and, as we observed, a small portion of the variation in warmth scores was found to be explained by GSR scores after controlling for other aspects of the ads (i.e., warmth/activation condition, the position of the ad segment, and the characteristics of the segment), and then only where warmth and GSR scores were measured simultaneously.

We found fairly strong evidence that the warmth measures responded in a transient fashion differently to warmth-inducing properties of the ads' segments than to non-warmth-inducing properties. Thus, one dimension of discriminant validity was supported. However, it is still not clear that what was responded to is "warmth" or that warmth is different from some other emotional responses to the ad segments.

Using a procedure that was biased against carryover responses, we detected sequence effects; that is, commercials preceded by "warm" commercials tended to be viewed as less warm than commercials preceded by "nonwarm" commercials, controlling for other factors. This suggests some nomological validity for the warmth measure, since sequence effects in emotional response across successive ads are expected.

Perhaps of greatest concern to prospective users of the Warmth Monitor wanting to measure warmth (or other constructs, for that matter) was the lack of discriminant validity observed in our last study. It is not obvious whether the validity problems stem from the Warmth Monitor or the concept of warmth. Warmth is not a central emotion in theories of emotion, it is vaguely defined and overlaps with other emotions (e.g., love, pleasure, attraction), and it has not been studied very thoroughly. Nevertheless, because of its simplicity, low cost, and reliability (in the measurement of some kind of transitory response to TV ads), we encourage additional research to refine and improve the validity of the Warmth Monitor method. Stayman and Aaker (1993) have adapted it to a dial-turning procedure and used it to measure other emotions. In addition, improvements in technology may make the use of seemingly complex techniques very easy in the near future. For example, a computer mouse or trackball could be used as the input device on a screen moving at a rate calibrated to the stimulus presentation.

[Received October 1992. Revised June 1993.]

FIGURE 3
MEAN SCORE GIVEN PRIOR AD WARMTH

[Line plot: mean Warmth(+) score (vertical axis, approximately -0.4 to 0.1) against segment position (horizontal axis, 1 to 5), with one trace for prior ad nonwarm and one trace for prior ad warm.]

NOTE.-Segment positions are as follows: 1, first; 2, second; 3, middle; 4, next to last; and 5, last.

TABLE 7
FACTOR ANALYSIS OF NINE PROCESS-TRACING EMOTION MEASURES

                                          Varimax-rotated two-factor solution
Variable        One-factor solution loading   Factor I    Factor II
Joy              .94                            .94        -.04
Surprise         .88                            .88         .40
Anticipation     .85                            .85         .39
Anger           -.90                           -.90         .37
Acceptance       .90                            .90         .01
Disgust         -.64                           -.65         .71
Attention        .92                            .92         .05
Warmth(O)        .91                            .91         .05
Warmth(+)        .93                            .93         .06

REFERENCES Aaker, David A., Douglas M. Stayman, and Michael R. Hagerty (1986), "Warmth in Advertising: Measurement, Impact, and Sequence Effects," Journal of Consumer Research, 12 (March), 365-381. Bagozzi, Richard P. (1988), "The Rebirth of Attitude Research in Marketing," Journal of the Market Research Society, 30 (April), 163-195. - - - (1991), "The Role of Psychophysiology in Consumer Research," in Handbook of Consumer Behavior, ed. Thomas S. Robertson and Harold H. Kassarjian, Englewood Cliffs, NJ: Prentice Hall, 124-161. - - - and Youjae Yi (1991), "Multitrait-Multimethod Matrices in Consumer Research," Journal of Consumer Research, 17 (March), 426-439. Batra, Rajeev (1986), "Affective Advertising: Role, Processes, and Measurement," in The Role of Affect in Consumer Behavior: Emerging Theories and Applications, ed. Robert A. Peterson et aI., Lexington, MA: Heath, 53-85.

Batra, Rajeev and Michael L. Ray (1986), "Affective Responses Mediating Acceptance of Advertising," Journal of Consumer Research, 13 (September), 234-249.

Ben-Shakhar, Gershon (1985), "Standardization within Individuals: A Simple Method to Neutralize Individual Differences in Skin Conductance," Psychophysiology, 22 (May), 292-299.

Boyd, Thomas C. and G. David Hughes (1992), "Validating Realtime Response Measures," in Advances in Consumer Research, Vol. 19, ed. John F. Sherry, Jr., and Brian Sternthal, Provo, UT: Association for Consumer Research, 649-656.

Edell, Julie A. and Marian Chapman Burke (1987), "The Power of Feelings in Understanding Advertising Effects," Journal of Consumer Research, 14 (December), 421-433.

Fenwick, Ian and Marshall D. Rice (1991), "Reliability of Continuous Measurement Copy-Testing Methods," Journal of Advertising Research, 31 (February-March), 23-29.

Holbrook, Morris B. (1986), "Emotion in the Consumer Experience: Towards a New Model of the Human Consumer," in The Role of Affect in Consumer Behavior: Emerging Theories and Applications, ed. Robert A. Peterson et al., Lexington, MA: Heath, 17-25.

Holbrook, Morris B. and Rajeev Batra (1987), "Assessing the Role of Emotions as Mediators of Consumer Responses to Advertising," Journal of Consumer Research, 14 (December), 404-420.

Holbrook, Morris B. and John O'Shaughnessy (1984), "The Role of Emotion in Advertising," Psychology and Marketing, 1 (Summer), 45-64.

Hughes, G. David (1992), "Realtime Response Measures of Television Commercials," Report 92-110, Marketing Science Institute, Cambridge, MA 02138-5396.

Kroeber-Riel, Werner (1979), "Activation Research: Psychobiological Approaches in Consumer Research," Journal of Consumer Research, 5 (March), 240-250.

Kroeber-Riel, Werner (1982), "Analysis of 'Non-cognitive' Behavior Especially by Non-verbal Measurement," working paper, Institute for Consumer and Behavioral Research, University of the Saarland, Saarbruecken.

Nunnally, Jum C. (1978), Psychometric Theory, New York: McGraw-Hill.

Ohman, Arne (1979), "The Orienting Response, Attention and Learning: An Information Processing Perspective," in The Orienting Reflex in Humans, ed. H. D. Kimmel et al., Hillsdale, NJ: Erlbaum, 443-471.

Peterman, Jack N. (1940), "The Program Analyzer: A New Technique in Studying Liked and Disliked Items in Radio Programs," Journal of Applied Psychology, 24 (December), 728-741.

Petty, Richard E., John T. Cacioppo, and David Schumann (1983), "Central and Peripheral Routes to Advertising Effectiveness: The Moderating Role of Involvement," Journal of Consumer Research, 10 (September), 135-146.

Plutchik, Robert (1980), Emotion: A Psychoevolutionary Synthesis, New York: Harper & Row.

Rothschild, Michael L., J. Hyun Yong, Byron Reeves, Esther Thorson, and Robert Goldstein (1988), "Hemispherically Lateralized EEG as a Response to Television Commercials," Journal of Consumer Research, 15 (September), 185-198.

Siddle, David A. T. and John A. Spinks (1979), "Orienting Response and Information Processing: Some Theoretical and Empirical Problems," in The Orienting Reflex in Humans, ed. H. D. Kimmel et al., Hillsdale, NJ: Erlbaum, 557-564.

Spinks, John A., Geoffrey H. Blowers, and Daniel T. L. Shek (1985), "The Role of the Orienting Response in the Anticipation of Information: A Skin Conductance Response Study," Psychophysiology, 22 (July), 385-394.

Srinivasan, V., P. Vanden Abeele, and I. Butaye (1989), "The Factor Structure of Multidimensional Response to Marketing Stimuli: A Comparison of Two Approaches," Marketing Science, 8 (Winter), 78-88.

Stayman, Douglas M. and David A. Aaker (1993), "Continuous Measurement of Self Report of Emotional Response," Psychology and Marketing, 10 (May-June), 199-214.

Steenkamp, Jan-Benedict E. M. and Hans C. M. van Trijp (1991), "The Use of LISREL in Validating Marketing Constructs," International Journal of Research in Marketing, 8 (November), 283-299.

Stewart, David W. and David H. Furse (1982), "Applying Psychophysiological Measures to Marketing and Advertising Research Problems," in Current Issues and Research in Advertising, ed. James H. Leigh and Claude R. Martin, Jr., Ann Arbor: University of Michigan Business School, 1-38.

Zeitlin, David M. and Richard A. Westwood (1986), "Measuring Emotional Response," Journal of Advertising Research, 26 (October/November), 34-44.
