DESCRIBING QUANTITATIVE DATA

CHAPTER 11

At the beginning of this textbook, we noted that we live in an "information society," bombarded by information from all sides. Much of the information we see or hear on a daily basis is in the form of "stats," all those numbers people use to describe things. For instance, you wake up to a television weather reporter saying it's 85 degrees outside, and pointing out that it's already 3 degrees warmer than the average for that day. On the way to work, you hear on the car radio that the Dow Jones Industrial Average (DJIA, or Dow) fell 49.23 points yesterday. During a sales team meeting at the office, a colleague displays charts and graphs that describe the amount and type of sales over the past few months. Back at home after work, the anchor on the early evening television news says that the latest poll shows that, unless something dramatic happens in the next few days, the mayoral election later that week is all but decided, as one candidate is preferred 55% to 40% over the other candidate, with only 5% undecided, and the margin of error for the poll is plus or minus 3%. You finally get a chance to read the newspaper, and see an Associated Press release that documents an important difference between women and men in the standard of living in the first year after divorce, with women experiencing a 27% decline while men experience a 10% increase ("Divorce Researcher," 1996). And later that night, while eating one of your favorite "comfort" foods and thinking you are safe from any more "stats," a television program about medicine says there is a substantial correlation (.70 is mentioned, but isn't explained) between eating that very food and heart disease!

As these examples, and many others we could have offered, demonstrate, we are exposed to quite a few statistics every day. Indeed, our society thrives on such information; in the words of the members of Generation X, "Stats rule." Some of these statistics, like temperature readings and arithmetic averages, are relatively straightforward, easy to understand, and trustworthy. We know how these statistics are produced and what they mean. But other statistics, such as estimating what a population believes from a relatively small sample (as in polling) or statistical tests of differences between groups (e.g., men and women) and relationships between variables (food consumption and heart disease), are more complex and difficult to understand. Most people don't know how these statistics are produced and, therefore, don't know whether they are accurate or the conclusions drawn from them trustworthy.

This chapter, and the subsequent chapters in this section of the textbook, is designed to help you understand various types of statistics used to analyze quantitative (numerical) data. Our goal is to help you become a competent consumer of such information. We start by explaining the nature and purposes of statistical data analysis.

MAKING SENSE OF NUMBERS: STATISTICAL DATA ANALYSIS

Each research methodology we have examined in this text relies on various measurement techniques (e.g., questionnaires, interviews, and observations) to acquire data. A set of acquired data, however, is not very useful in itself; it needs to be analyzed and interpreted. Data analysis is the process of examining what data mean to researchers. Data analysis


PART FOUR

ANALYZING AND INTERPRETING QUANTITATIVE DATA

transforms data into useful information that can be shared with others. Quantitative data lend themselves to data-analytic procedures associated closely with the applied branch of mathematics known as statistics. Statistical data analysis, therefore, refers to the process of examining what quantitative data mean to researchers.

The word "statistics" comes from the Latin root status, which implies that statistics are used to understand the status or state of quantitative data. The earliest forms of organized statistics were of this type. The ancient civilizations of Babylon and Egypt, for example, collected statistics on the number of people and livestock, among other things, for the purposes of taxation and the raising of armies (see M. Cowles, 1989). Today, our society produces a large number of statistics for many different reasons, including trying to count the number of people living in the United States for the purposes of taxation and, when there is a need for a draft, the raising of an army.

Used in this sense, then, statistics refers to any numerical indicator of a set of data. There are a number of techniques and procedures used to produce such statistics, such as the arithmetic average. Many people, in fact, equate the term statistics with these techniques and procedures (Gephart, 1988). Let's call these procedures and the descriptive information they yield "descriptive statistics."

But as M. Cowles (1989) points out, starting in the seventeenth century another activity was referenced by the term statistics:

    the practice of not only collecting and collating numerical facts, but also the process of reasoning from them. Going beyond the data, making inferences and drawing conclusions with greater or lesser degrees of certainty in an orderly and consistent manner is the aim of modern applied statistics. (p. 6)

Such inference making was made possible by the development of sophisticated techniques and procedures for analyzing quantitative data, such as estimation procedures, examined later in this chapter, which allow researchers to generalize from a sample to a population. Let's call these procedures and the inferential information they yield "inferential statistics."

Drawing conclusions and making inferences is part and parcel of how we now process numerical information. In fact, it's doubtful that there is such a thing as a "neutral" statistic, one about which we don't infer something. Even something as simple as knowing the temperature may lead to an inference about whether it's a good day to go to the beach or a good day to stay indoors and get some work done.

Today, statistics are one of the most pervasive and persuasive forms of evidence used to make arguments. In fact, Abelson (1995) believes that "the purpose of statistics is to organize a useful argument from quantitative evidence, using a form of principled rhetoric" (p. xiii). Using some of the examples that opened this chapter, a stockbroker may urge a client to sell some stocks because of a falling DJIA. The colleague at work may use the charts and graphs documenting sales to persuade your team to proceed in a certain way. The television news anchor actually made the claim, on the basis of poll statistics, that the upcoming election was essentially over. Statistics on the effects of divorce rates on standard of living have been used by policymakers to help bring about stricter child-support enforcement and more flexible property-division laws ("Divorce Researcher," 1996). And on the basis of statistical evidence, physicians warn against eating harmful foods. We are sure everyone, including you, has used statistics at some point to support or refute an argument.

Many people, in fact, may be wary of statistical information because it is used so frequently as evidence in arguments. They believe, as Huff (1954) pointed out long ago, that statistics is a way of lying with numbers. (According to Mark Twain, England's Prime Minister Benjamin Disraeli said there were three types of lies: "lies, damn lies, and statistics.") People, thus, doubt the "principled" nature of numerical evidence in arguments. And in many cases, well they should.
If you turn on the television and see an advertisement boldly claiming that four out of five doctors recommend a certain headache medicine, you should be wary of the claim. Not only don't we know how this "study" was done, but also we suspect the agenda of selling the medicine had an important effect on how those statistics were produced. (By the way, this claim also means that 20% of doctors don't recommend it, a pretty high figure. Imagine how effective the advertisement would sound if it said, "Twenty percent of doctors do not recommend this medicine.")

We've probably all been guilty of using statistics in a less-than-honest manner, or what Huff (1954) calls "statisticulation," that is, misinforming people by manipulating statistical information. So part of our distrust of statistics may be because each of us has used them as we fear they may be being used on us. But this doesn't mean we should reject statistics whenever we hear them. After all, as Abelson (1995) retorts, "When people lie with words (which they do quite often), we do not take it out on the English language" (p. 1). We don't start to distrust all words.

We think a main reason for people's distrust of statistics, especially complex statistical information, is a lack of knowledge about how such information is produced and whether it is valid. There are also a number of people who suffer from what Paulos (1988) calls "innumeracy," the mathematical equivalent of illiteracy that leads them to be uncomfortable dealing with numbers and elements of chance. These problems mean that people often are not confident that they can understand fully the value, or lack thereof, of particular statistics.

As just one example, consider again the Dow Jones Industrial Average. Most people probably don't know that the Dow is simply the total stock prices of 30 representative stocks, not a very large number. And because it gives greater weight to higher priced stocks, the Dow is a notoriously poor gauge of overall stock market activity (Maier, 1991). Stock market experts look more at the New York Stock Exchange Index or the Standard and Poor's (S&P 500) Index, which are much better measures because they include many more stock prices and weight them by corporate size (Maier, 1991). For example, on Tuesday, December 15, 1998, the Dow fell 126.16 points, or 1.4% of its market value, but the day was actually much worse, as the S&P 500 declined 25.56 points, or 2.2% of its market value. And on Thursday, January 21, 1999, the Dow dropped 19.31 points while the S&P 500 rose 4.26 points. Most people, however, rely on the Dow, and the media continue to feature it.

The lack of knowledge about statistical information results either in the tendency to reject the information completely or to accept it at face value. Both responses demonstrate a lack of competence and confidence and place people at risk. What people need to do, instead, is to become competent and confident consumers of statistical evidence, so that they can make informed decisions about whether to accept such information when they see or hear it.

The first step, which we've just accomplished, is understanding that the word statistics refers to at least two things: (a) products: numerical descriptions and inferential statistics that characterize a set of quantitative data, and (b) techniques: the application of procedures to produce numerical descriptions and statistical inferences. We are now ready to take a look at some of these statistical techniques and products, and the conclusions and inferences drawn from them that comprise the statistics enterprise.

To do so, we focus on the two general purposes of description and inference as reflected in two types of statistical data analysis. Descriptive statistical data analysis is used to construct simple descriptions about the characteristics of a set of quantitative data. In this chapter, we explore three ways to describe data: (a) numerical indicators that summarize the data, called summary statistics, (b) converting raw scores into what are called standard scores, and (c) constructing visual displays of data.

Inferential statistical data analysis has two interrelated purposes: (a) estimation, that is, estimating the characteristics of a population from data gathered on a sample, and (b) significance testing, that is, testing for significant statistical differences between groups and significant statistical relationships between variables. In Chapter 12, we cover the general principles of inferential statistical data analysis, while Chapters 13 and 14 examine the specific data-analytic procedures used to test for significant statistical differences and relationships.


DESCRIBING DATA THROUGH SUMMARY STATISTICS

When one has a relatively small data set, such as temperature readings for the previous five days, one can verbally describe the entire data set by referring to each individual entry (e.g., Monday was 70 degrees, Tuesday was 60 degrees, and so forth). But when there is a fairly large data set, it doesn't make much sense to verbally give all the individual entries that comprise the set. For example, the radio announcer doesn't tell how the stock market did yesterday by reading off each and every stock. (If you want that kind of detail, you have to read the newspaper or watch CNN or some other television channel that lists stock prices running across the bottom of the screen during daytime programming.) Nor does the television news anchor, when describing what people said about how they would vote, read off each of the 1600 (or however many there are) individual responses. In both cases, and many others, the sheer amount of information would be difficult, if not impossible, to process.

The data that comprise a relatively large data set, therefore, must be condensed in some way to make sense of them. Hence, the DJIA (despite its flaws) is used to describe how millions of stocks and tens of millions of stock market transactions did on any particular day. And the percentage of people who said they would vote for one candidate versus the other is used to summarize people's answers to a poll. In each case, the data have been condensed to a numerical indicator that best summarizes the data set, or what is called a summary statistic. Summary statistics, thus, provide an efficient way to describe an entire set of quantitative data.

There are two types of summary statistics that are most important for our purposes: measures of central tendency and measures of dispersion. The first type provides a numerical indicator that describes the "center" of a data set, while the second type provides a numerical indicator of how much the scores in a data set differ from the center point, that is, how tight or spread out those scores are from the center point.

Measures of Central Tendency

We seem to be fascinated in the United States with "the typical." As Weisberg (1992) explains, "We want to know what typical people do and think, perhaps so that we can be sure that we ourselves are not unusual in our actions and attitudes" (p. 1). Researchers, too, are interested in what is "typical" about a set of quantitative data. They seek a summary statistic that provides the "typical" or "average" point in a data set, the one score that best represents the entire distribution of data. Intuitively, it makes sense that the most representative point would be somewhere toward the middle of the data set, as opposed to being closer to either of the extremes. Thus, for example, if a researcher measured 100 organizational employees on the quality of their communication with their supervisor (say on a 5-point scale that ranges from very high to very low), the highest score wouldn't be a very good indicator of the "typical" employee's communication, and neither would the lowest score. Either extreme would give an unrepresentative view of that communication. A better indicator of the typical employee's communication is somewhere toward the middle or center of all the scores. The researcher would want to use this "middle" score to best represent the "central tendency" of the distribution. Hence, measures of central tendency (also called representative values) describe the center point of a distribution of quantitative data.

There are a number of measures of central tendency, but the three most common types are the mode, median, and mean. Each is a summary statistic that identifies the center point of an array of quantitative data, an ordered display of numerical measurements (e.g., the distribution 5, 3, 10, 7 becomes an array of data when put in either an ascending order, beginning with the lowest number and moving to the highest number [3, 5, 7, 10], or a descending order, beginning with the highest number and moving to the lowest number [10, 7, 5, 3]). Which measure of central tendency is used depends on the type of data collected. Remember from Chapter 4 that there are four types of measurement scales: (a) nominal scales, which classify a variable into different categories; (b) ordinal scales, which rank order categories along some dimension; (c) interval scales, which establish equal distance between each of the adjacent points along the scale and have an arbitrary zero point; and (d) ratio scales, which are like interval scales but establish an absolute zero point where the variable being measured ceases to exist. Let's examine each of the three measures of central tendency and the type of data for which they are most useful.

Mode. The mode (Mo), the simplest measure of central tendency, indicates which score or value in a distribution occurs most frequently. It literally points out the category or numerical score that is most typical in terms of the number of times it appears.

The mode is an appropriate measure of central tendency when describing a set of nominal data. Because nominal data are in the form of categories, no mathematical analyses (such as computing the mean, the arithmetic average, discussed below) can be done; all one can do is indicate which category occurred the most.

For example, as part of an attempt to understand more fully the important problem of peer sexual harassment, Ivy and Hamlet (1996) asked 163 students to indicate which behaviors from a list of 15 (identified from previous research) did or did not constitute sexual harassment (by answering "yes" or "no"). The researchers counted the actual number of "yes" and "no" answers for each behavior (see Figure 11.1). As you see, the category of "sexual assault" received the most "yes" responses and, correspondingly, the fewest "no" responses. Because that category occurred most frequently, it is the mode for this distribution. It is the category that is most typical of this data set. (By the way, the category of "humor and jokes" is the antimode [Vogt, 1999] because it received the fewest "yes" responses and, correspondingly, the most "no" responses. The antimode is often helpful for knowing, for example, what people least prefer.)

BEHAVIOR                                                                  FREQ. "YES"   FREQ. "NO"
Suggestive or insulting vocal sounds                                          127           35
Whistling in a suggestive manner                                               93           69
Repeatedly being asked out on a date, even after saying no                    109           54
Humor & jokes about sex or about women and/or men, in general                  76           83
Sexual propositions, invitations, or other pressures for sexual activity      146           16
Continual compliments of a personal and/or sexual nature                      123           37
Staring                                                                        93           69
Making obscene gestures                                                       140           21
Questions about personal and/or sexual life                                   125           35
Implied or overt sexual threats                                               146           15
Patting, stroking, pinching, & other similar forms of touch                   147           14
Brushing up against the body                                                  124           36
Attempted or actual kissing                                                   132           29
Sexual assault                                                                151           10
Forced sexual intercourse                                                     149           12

Source: Adapted from Diana K. Ivy and Stephen Hamlet, "College Students and Sexual Dynamics: Two Studies of Peer Sexual Involvement," Communication Education, 45(2), p. 159. Copyright by the National Communication Association, 1996. Reprinted by permission of the publisher.
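To make the idea of a mode for nominal data concrete, here is a minimal Python sketch that finds the mode and the antimode among the "yes" counts in the table above. The shortened category labels and the function names are our own, chosen only for illustration; the counts are the ones reported in the table.

```python
# "Yes" frequencies for the 15 behaviors, copied from the table above.
# The shortened labels are ours, used only to keep the example compact.
yes_counts = {
    "Suggestive or insulting vocal sounds": 127,
    "Suggestive whistling": 93,
    "Repeatedly asked out": 109,
    "Humor and jokes": 76,
    "Sexual propositions": 146,
    "Continual compliments": 123,
    "Staring": 93,
    "Obscene gestures": 140,
    "Personal questions": 125,
    "Sexual threats": 146,
    "Patting, stroking, pinching": 147,
    "Brushing up against the body": 124,
    "Attempted or actual kissing": 132,
    "Sexual assault": 151,
    "Forced sexual intercourse": 149,
}


def mode_category(counts):
    """Return the category that occurs most often."""
    return max(counts, key=counts.get)


def antimode_category(counts):
    """Return the category that occurs least often."""
    return min(counts, key=counts.get)


print("Mode:", mode_category(yes_counts))
print("Antimode:", antimode_category(yes_counts))
```

Because the categories are nominal, nothing is added or averaged here; the sketch only compares how often each category was chosen.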

Whereas the mode is the appropriate measure of central tendency for a set of nominal data, it is usually not very useful when applied to ordinal data and is of very limited use for describing the center of a set of interval/ratio data. For example, in a race among nine horses, let's assume one horse finishes first, another finishes second, and so on, down to the horse that finishes ninth. Because each rank only appears once in this set of ordinal data, each rank is the mode. The mode, in this case, thus, is meaningless. Even if two horses tied for fifth, that mode would not tell much, although if two horses tied for first, it would. (Notice, however, that if the data were the number of times a horse finished in first place, second place, and so forth, this would be nominal data, and the mode, the category that occurred most frequently, would be the appropriate summary statistic for describing the central tendency of this data set.)

The mode can also be applied to interval/ratio data to describe the score that occurs most often. Suppose, for example, that a researcher wants to know how much information people recall from a public service announcement (PSA) about the adverse effects of tobacco. The researcher shows seven people a PSA that contains 21 important pieces of information, and then asks them one hour later to recall that information. The researcher counts the number of correct pieces of information recalled (and, thereby, produces ratio data, as there is an absolute zero point), and finds the following distribution for the seven participants:

P1    P2    P3    P4    P5    P6    P7
20    12    11     9     5     5     3

In this distribution, the mode is "5" because it occurs more than any other score. As you can see, the mode does not appear to be a very good expression of the center of this distribution. If we had to choose one number from this distribution to describe the central tendency, the number "9" would appear to be a much better choice. And if either extreme score (20 or 3) occurred three times, then it would be the mode and, of course, an even weaker description of the central tendency than the number "5," since the new mode would also be the most extreme point in the distribution. Thus, it usually makes little sense to use the mode with a set of ordinal, interval, or ratio data.

As one final example, consider the amount of money people make in the neighborhood where you live. The mode could be used to describe the central tendency of this data set, but no two people may make the exact same amount, so there may be no mode. Or suppose two people make exactly the same amount and that happens to be the lowest pay in the neighborhood. In that case, the mode would be a poor measure of the typical income of people living in that neighborhood. For reasons like this, the mode is seldom used to describe the central tendency of ordinal, interval, or ratio data.
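The two examples above, the PSA recall scores and the nine-horse race, can be checked with a short Python sketch. It assumes Python 3.8 or later, where the standard library's statistics module provides multimode; the variable names are ours.

```python
from statistics import multimode  # available in Python 3.8 and later

# Number of PSA facts recalled by the seven participants (from the text).
psa_recall = [20, 12, 11, 9, 5, 5, 3]

# Finishing positions in the nine-horse race: each rank appears exactly once.
horse_ranks = list(range(1, 10))

# multimode returns every value tied for the highest frequency.
psa_modes = multimode(psa_recall)
rank_modes = multimode(horse_ranks)

print("PSA recall mode(s):", psa_modes)
print("Number of tied modes among the horse ranks:", len(rank_modes))
```

The recall data yield a single mode that sits well below the middle of the scores, while the race data return all nine ranks as "modes," which is the sense in which the mode tells us nothing there.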

Another problem with the mode for nominal, interval, or ratio data is that there may be more than one category or score that occurs the same number of times. For example, if there were two 20s and two 5s in the PSA information recall data, there would be two modes. Such a distribution is called bimodal, in contrast to a unimodal distribution in which there is only one mode. And if there were two 20s, two 12s, and two 5s, it is a multimodal distribution. In such cases, although the mode still helps us to understand the data, it provides even less useful information about the center point of the data. Mercifully, multimodal data are seldom reported in academic literature.

In summary, the mode is an appropriate measure of the central tendency of a set of nominal data that tells researchers which category occurs the most. The mode, however, cannot be used to describe the center point of a set of ordinal data (when each rank only occurs once), and typically is not a very effective measure of the center point of a set of interval/ratio data.

Median. The median (Md or Mdn) divides a distribution of quantitative data exactly in half. It is the score above which and below which half the observations fall. It literally locates the middle case, which is why it sometimes is referred to as a location, positional, or order measure (Weisberg, 1992).

The median is very appropriate for describing the center point of a set of ordinal (rank-ordered) data and often is effective for describing the center point of interval/ratio data. In the case of ordinal data, dividing the possible ranks at the midpoint does, indeed, give us the center of the data. In the horse race example above, the horse that finished fifth out of nine horses is the median, as four horses finished above it and four finished below it.

This same logic and utility applies to a set of interval/ratio data. In the distribution about information recall presented earlier, the median is 9, because half the distribution falls above this score and half below it. (Now we see why that number appeared so tempting when we were talking about the mode.)

In a set of ordinal or interval/ratio data that contains an odd number of scores, the median is an actual score in the distribution (e.g., the number "9" is an actual score in the information recall distribution). When there are an even number of observations, the median is found by taking the two middle numbers, adding them together, and dividing by 2. Thus, for example, if the number 9 is removed from the PSA recall information distribution (leaving 20, 12, 11, 5, 5, 3), the median is found by taking the two middle numbers (11 and 5), adding them together (16), and dividing by 2, which equals 8. So 8 is the median of this distribution. Hence, when there are an even number of scores in a distribution, the median often is not a value that appears in the distribution. It is a calculated value halfway between the two middle scores.

The median often makes good sense as an indicator of the central tendency of interval/ratio data. One advantage the median has over the mode in this case is that it takes into account the entire set of scores in a distribution, rather than only one or a few scores, as does the mode. The median considers the entire set of scores and divides it at the halfway point.

A more important advantage is that the median is not swayed by extreme scores, or what are called outliers. For example, the score of 20 in the PSA information recall scores seems to be a pretty extreme score as compared to the other scores in that distribution. But this doesn't affect the median per se. Or, let's suppose there are 100 possible pieces of information in the PSA and all the scores remained the same, except that the high score was 80 instead of 20. The median (9) would still be the same. Some measures of central tendency, such as the mean (discussed below), are greatly affected by extreme scores, but the median is not, and is, therefore, considered to be a resistant statistic (Weisberg, 1992).

The median, thus, is a very useful measure of central tendency when a set of interval/ratio data contains very extreme scores. For example, the U.S. Census Bureau (1999b) reported that the median household income for 1997 was $37,303. The government uses the median to describe the "average" household income because the few billionaires and millionaires in the population would drive the mean sky-high, maybe to $100,000 or even much higher. This figure, however (and unfortunately), just doesn't describe very well the "typical average" household income. Hence, government officials divide the distribution of all reported household incomes at the halfway point and use this median to describe the "average" household income.

This example also shows that the term "average" is used as a synonym for any measure of central tendency. Most people think of an "average" as the arithmetic average (the mean) of a set of data, but in statistics, "average" means the "typical" or "most representative" category or score that describes the central tendency of a distribution. Thus, the mode, median, and mean are each an "average." So the next time the term "average" is used as statistical evidence in an argument, check to see which measure of central tendency is being used.

The ability of the median to resist being influenced by extreme scores, however, is also its primary weakness, at least in comparison to the mean. For example, if the score of 20 in the information recall distribution is changed to 19, the median is still 9. But as we will see below, the mean would change. The median, therefore, is not sensitive to changes in the scores that comprise the data set.

In summary, the median is usually the appropriate measure of central tendency for a set of ordinal data. It is also often quite useful for describing interval/ratio data, especially when the distribution has an extreme score(s) that would lead the mean to not represent the center point very well.

Mean. The mean (X̄; sample mean, M; population mean, μ) is the arithmetic average, and is computed by adding all the scores in a distribution of interval/ratio data and dividing by the total number of scores. The scores can be added together and divided in this manner because the points on an interval/ratio measurement scale are of equal distance (e.g., there is the same amount of difference between 4 and 5 as there is between 27 and 28).
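As a rough illustration of the median calculations described above and of the definition of the mean just given, the following Python sketch reworks the PSA recall scores. It uses the standard library's statistics.median and computes the mean directly as a sum divided by a count; the variable names are ours and are not part of the original study.

```python
from statistics import median

# PSA recall scores for the seven participants (from the text).
psa_recall = [20, 12, 11, 9, 5, 5, 3]

# Odd number of scores: the median is the middle score itself.
median_odd = median(psa_recall)

# Even number of scores (the 9 removed): the median is halfway
# between the two middle scores, 5 and 11.
without_nine = [score for score in psa_recall if score != 9]
median_even = median(without_nine)

# Replace the high score of 20 with 80, or with 19: the median does not move.
with_80 = [80 if score == 20 else score for score in psa_recall]
with_19 = [19 if score == 20 else score for score in psa_recall]
median_with_80 = median(with_80)
median_with_19 = median(with_19)


def arithmetic_mean(scores):
    """Add all the scores and divide by the number of scores."""
    return sum(scores) / len(scores)


mean_original = arithmetic_mean(psa_recall)
mean_with_19 = arithmetic_mean(with_19)

print("Medians:", median_odd, median_even, median_with_80, median_with_19)
print("Means:", round(mean_original, 2), round(mean_with_19, 2))
```

The medians stay at 9 whether the top score is 19, 20, or 80, whereas the mean shifts as soon as a single score changes, which is the contrast between a resistant and a nonresistant statistic.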
In the PSA information recall (ratio) data above, the mean is found by adding up all the scores (20 + 12 + 11 + 9 + 5 + 5 + 3 = 65) and dividing this total by the number of scores on which it is based (7), which equals 9.29. (Typically, means are reported using either one or two decimal points, and are rounded off to the nearest figure; the mean is actually 9.285, so it is rounded off, in this case up to 9.29.)

The mean usually is the most appropriate and effective measure of central tendency of a set of interval/ratio data precisely because it is sensitive to every change in the data. For example, if the number 20 is changed to 19, the total is now 64 and the mean is 9.14. The mean, therefore, changes if any number in a distribution changes; the mode and median, as we have seen, don't have this property. The mean, therefore, is considered to be a nonresistant statistic (nonresistant to extreme scores), in contrast to the resistant statistic of the median (Weisberg, 1992).

Of course, as explained previously, the mean's sensitivity to all scores in a distribution is a problem when there are extreme scores that lead the mean to be an atypical or unrepresentative middle point (such as in the example of household income). In such cases, the median is used to describe the central tendency, although there are some types of means (other than the arithmetic average) and procedures that can be used to deal with extreme scores (e.g., a trimmed mean removes the most extreme scores entirely and winsorizing the data changes the most extreme scores to the next less extreme scores). It should also be pointed out that there are other types of means appropriate for particular kinds of ratio data (e.g., geometric mean for measuring relative change and harmonic mean when averaging rates) (see Weisberg, 1992).

One more minor caution about the mean. As you see from the calculations here, the mean most often results in a fractional value (e.g., 9.29 rather than 9). This is a potential problem only to the extent that it causes difficulties in interpreting statistical information. For example, the U.S. Census Bureau (1999a) reported that the average size of a household in the United States in 1998 was 2.62 people.
While we might suspectthat some members ofhouseholds "aren't all there," it isn't physically possibleto havea .62 personliving in a house! In summary, the mean is the most sensitive

as long asthe datadon't contain an extremecase(s) that throws off the mean too much, making it unrepresentativeof the centerpoint of the data. Measures of Dispersion Measuresof central tendency describethe middle or central point of a distribution of quantitative data. When the data are in the form of ordinal, interval, or ratio measurements,the mode, median, and mean try to tell us the typical or most representative score ofthe distribution. But just giving the central tendencysufirmary. statistic by itself can be misleading. For example, supposewe ask four married couples, divided on the basisof whether they are young (say 18-35) or middle-aged (say 36-65), how satisfied they are with the communication in their relationship on a 5-point scale,where 1 = extremely dissatisfied,2 = dissatisfied,3 = neither dissatisfiednor satisfied,4 = satisfied, and 5 = extremely satisfied. The following scoresare obtained: YoungCouples CoupleA: 3 3 CoupleB: 3 J

Middle-AgedCouples CoupleC: 5 1 Couple D: 1 5
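As a minimal sketch (in Python, which the chapter itself does not use), the eight scores above can be summarized by group. The variable and function names are illustrative assumptions, and the spread measure used here, the distance between the highest and lowest score, is only one simple option.

```python
from statistics import mean

# Satisfaction scores (1-5) from the example: two partners per couple.
young = [3, 3, 3, 3]         # Couples A and B
middle_aged = [5, 1, 1, 5]   # Couples C and D


def score_range(scores):
    """Distance between the highest and lowest score in a group."""
    return max(scores) - min(scores)


for label, scores in (("Young", young), ("Middle-aged", middle_aged)):
    print(label, "mean =", mean(scores), "spread =", score_range(scores))
```

Both groups produce the same mean, while the spread is 0 for the young couples and 4 for the middle-aged couples.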

If the mean of each group is calculated, we arrive at the same exact number (3). If we just relied on the mean, we would have to conclude that these groups are equivalent and that the couples are neither satisfied nor dissatisfied with the communication in their relationship. But clearly, these two groups are not equivalent. In the case of the two young couples, each partner is moderately satisfied, whereas one person in each of the two middle-aged couples is extremely satisfied while the other is extremely dissatisfied. Just using the mean to describe each group overlooks an important difference: that the dispersion or spread of the scores of the middle-aged couples from their distribution's center point (the mean) is much greater than the dispersion or spread of the scores of the young couples from their distribution's center point. (By the way, researchers used to calculate a couple's satisfaction by averaging individuals' scores, but now they tend to use the lowest score, reasoning that if one partner is dissatisfied, the relationship should be classified as dissatisfying.)

Measures of central tendency, therefore, while identifying the typical or most representative score, tell us nothing about how the scores in a distribution differ or vary from this representative score. This is fine as long as all the scores within a distribution are exactly the same (such as with the young couples above). But variety is the spice of life, and the same is true of distributions: they contain scores that vary from the center point. What would be most helpful, then, is some indication of how typical (or representative) the typical score actually is.

Once again, it usually isn't feasible to describe how each individual score in a large data set varies from the center point, so researchers use summary statistics to describe the extent to which the scores as a set vary from the center point of that distribution. Measures of dispersion (also called measures of variability) report how much scores vary from each other or how far they are spread around the center point of a data set and across that distribution.

Measures of dispersion typically are applied to ordinal, interval, and ratio data because these scales use ordered numbers that vary. There is a measure of dispersion for nominal data, called the variation ratio, but it only indicates the relative frequency of the nonmodal scores in a distribution (e.g., if the relative frequency of the modal score, the proportion of total scores in a distribution accounted for by the modal category out of 1.00 or 100%, is .30, the variation ratio is .70) and, thus, is not used very much. While there are a number of measures of dispersion that apply to scales that use ordered numbers, three common measures are the range, variance, and standard deviation.

Range.
The range (or span), the simplest measure of dispersion, reports the distance between the highest and lowest scores in a distribution. The range, therefore, is calculated by subtracting the lowest number from the highest number in a distribution (these are called extreme values). For example, the range for the PSA information recall data discussed earlier (20, 12, 11, 9, 5, 5, 3) is found by subtracting 3 from 20, which equals 17. This can also be expressed as "a range from 3 to 20."

The range gives a general sense of how much the data spread out across a distribution, which can be helpful for understanding whether a study included a lot of variability or whether it drew from a narrow spectrum. For example, if a researcher intends to study a communication variable across a wide range of age groups, a sample of people aged 18-21 (range = 3) is not very diverse, whereas a sample of people aged 12-70 (range = 58) potentially is. We say "potentially" because diversity, in this case, would depend on whether there are people represented throughout that age spectrum. A sample of people aged 12, 13, 14, 15, 16, 17, and 70, although it has a range of 58, obviously is not very diverse, whereas a sample of people aged 12, 27, 35, 48, 52, 63, and 70, which also has a range of 58, is far more diverse.

Researchers sometimes report the range for the variables studied, as this can potentially be helpful for evaluating the validity of research. For example, Barnett, Chang, Fink, and Richards (1991) examined seasonal patterns of television viewing, believing that people watched more television during the winter (when it is cold in much of the United States) than during the summer. To test this prediction, Barnett et al. used the Nielsen Television Index of the average daily viewing hours (the number of hours television sets are turned on) per household, given at monthly intervals from September, 1950, to December, 1988. The range for this distribution of data was, thus, 38 years and 4 months, quite a large range.
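The range calculations in the examples above (the recall scores and the two age samples) can be sketched in Python. The helper names are hypothetical, and `largest_gap` is an illustrative addition rather than a statistic discussed in the text; it simply hints at why two samples with the same range can differ in how diverse they are.

```python
def value_range(scores):
    """Highest score minus lowest score."""
    return max(scores) - min(scores)


def largest_gap(scores):
    """Largest distance between neighboring scores once they are sorted.

    Illustrative helper only (an assumption, not from the text).
    """
    ordered = sorted(scores)
    return max(b - a for a, b in zip(ordered, ordered[1:]))


recall = [20, 12, 11, 9, 5, 5, 3]             # PSA information recall data
clustered_ages = [12, 13, 14, 15, 16, 17, 70]
spread_ages = [12, 27, 35, 48, 52, 63, 70]

print(value_range(recall))                    # 17
print(value_range(clustered_ages), value_range(spread_ages))  # 58 58
print(largest_gap(clustered_ages), largest_gap(spread_ages))  # 53 15
```

The two age samples share a range of 58, but nearly all of that distance in the first sample comes from a single jump between 17 and 70.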
Barnett et al. also gathered data on the average monthly temperature and the total monthly precipitation from publications by the National Oceanic and Atmospheric Administration, and they calculated the number of minutes of daylight on the basis of sunrise and sunset for the 15th of each month as reported by the St. Louis Post-Dispatch newspaper. As Figure 11.2 shows, there are large differences between the maximum and the minimum scores for each of these four variables. The range for each of these variables is: (a) television viewing = 267, (b) daylight = 322, (c) temperature = 53.6, and (d) precipitation = 3.61. The large ranges suggest that these researchers examined a diverse set of scores for each variable, although, as the previous example about age showed, we would need more specific information before this could be concluded.

FIGURE 11.2  DESCRIPTIVE STATISTICS FOR TELEVISION VIEWING AND ENVIRONMENTAL DATA

Variable              Mean       SD         Maximum    Minimum
Television viewing    351.552     60.086    478.20     211.20
Daylight              727.478    111.745    900.00     578.00
Temperature            52.940     15.288     76.40      22.80
Precipitation           2.419      0.561      4.15       0.54

Source: George A. Barnett, Hsui-Jung Chang, Edward L. Fink, and William D. Richards, Jr., "Seasonality in Television Viewing: A Mathematical Model of Cultural Processes," Communication Research, 18(6), p. 760, copyright © 1991 by Sage Publications. Used by permission of Sage Publications.
NOTE: N = 460. All variables reflect monthly averages for the United States. See text for complete description and data sources.

Knowing the range can also provide a general understanding of how the scores of two groups of people vary. For example, in the communication satisfaction scores given above for young and middle-aged couples, the range of scores is 0 for the young couples and 4 for the middle-aged couples. This summary statistic of dispersion confirms what we saw when we eyeballed the data: these two groups of couples are not equally satisfied.

The problem with the range is that it is a nonresistant measure that is overly sensitive to extreme scores. Hence, one or two extreme scores make a distribution look more dispersed than it actually is. For example, in the information recall distribution of 20, 12, 11, 9, 5, 5, 3, the range is 17, but suppose the PSA contained 100 pieces of information, and while the other scores remained the same, the person who scored 20 scored 100. The range is now 97, but this is misleading, for there aren't any actual cases in between the scores of 12 and 100. Equally important, the range is not sensitive to differences between scores within a distribution. Any score other than the highest and lowest scores can change without the range changing. So the range is not a very sensitive measure of dispersion.

To compensate for these problems, researchers can divide a distribution at the median (the halfway point) and calculate the range of values between the median and the lowest score, called the lowspread, and the range of scores between the median and the highest score, called the highspread. It is more common, however, to divide a distribution at the 25th percentile (the point below which 25% of the scores fall) and at the 75th percentile (the point above which 25% of the scores fall). Calculating the distance between these two percentiles yields the interquartile range (IQR; also called the quartile deviation or midspread); the points in the distribution that divide the scores at the 25th and 75th percentiles are called the lower and upper hinge, respectively. This summary statistic represents the range of scores for the middle half of a distribution. The IQR, therefore, is less affected by extreme cases, making it a more resistant measure of dispersion than the range per se. And if finer gradations are desired, researchers can divide the interquartile range exactly in half to produce the semi-interquartile range or divide the distribution into deciles, 10 equal parts.

The range and its variations are a good first step toward understanding the amount of dispersion or variance in a set of ordered measurements and, in fact, are the only measures of dispersion that can be applied meaningfully to ordinal (rank-ordered) data. But the range is also limited in describing the amount of dispersion in a set of interval/ratio data. It is similar to the median in that way: Just as the median is fairly limited as a measure of central tendency of interval/ratio data, so too is the range (and its variations) fairly limited as a measure of dispersion of interval/ratio data, for it only gives researchers a global picture of the amount of dispersion. What is needed are measures of dispersion that are as sensitive as the mean is as a measure of central tendency; that is, measures of dispersion that take into account all the individual scores in a distribution. For this, researchers turn to the measures of dispersion known as variance and standard deviation.

Variance.

Variance (sample variance, $S^2$, $s^2$; population variance, $\sigma^2$, pronounced "sigma squared") is a mathematical index of the average distance of the scores in a distribution of interval/ratio data from the mean, in squared units. That definition probably doesn't make much sense, so let's walk through it and make it clear. We'll use an easier example (one that doesn't contain a mean with a fraction) to explain variance and standard deviation. Suppose a researcher asks all five campaign managers at a public relations firm to keep track of the number of positive local newspaper reports written about their campaigns over the course of a month. At the end of that month, the five managers (P1 through P5) report the following numbers of positive newspaper reports:

P1    P2    P3    P4    P5
10     7     5     2     1

Figure 11.3 shows how variance is calculated for this distribution, using both the definitional formula and an easier computational formula. Walking through the calculation for the definitional formula will help to get a handle on it.

Common sense tells us that each individual score differs from the mean (once in a while, as in this distribution, one or more scores are the same as the mean, but the other scores differ from it). The amount that one score differs from the mean is called its deviation score (deviate). For these five scores, the mean is 5; the amount each score differs from this mean, therefore, is its deviation score. The first score of 10 deviates from the group mean of 5 by +5 points. The deviation scores for all the raw scores (starting with 10 and working down) are: +5, +2, 0, -3, -4. (Note that we indicate whether the deviation score is greater or less than the mean by using positive and negative signs.)

So far, so good, but here's the problem. Add up all the deviation scores and you get zero, because the negative scores below the mean equal the positive scores above it. This is always the case for deviation scores in any distribution. So we cannot sum the positive and negative scores to obtain an overall deviation score, because that would make it look like no variation exists. But you can clearly see that the scores do indeed vary from the mean.

The best way to handle this is quite simple; we square the deviation scores (that is, multiply each deviation score by itself) and add up the squared scores. Squaring the deviation scores accomplishes two things. First, it converts all deviation scores to positive numbers (a negative number multiplied by itself becomes a positive number). Second, it preserves the original information about deviation and keeps the differences between scores intact.

We can now sum the squared deviations in the distribution. The resulting total (54) is called the sum of squares (SS). The catch is that the sum of squares score is both cumulative and expressed in squared units. To find out how much the scores as a set deviate "on the average" from the mean, we must divide the sum of squares score (54) by the number of scores (5) in the distribution. The result (10.8) is the average, or mean, of the squared deviations, which is called the variance.
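The walk-through above can be sketched in Python as a check on the arithmetic. Variable names are illustrative assumptions. The second calculation follows the computational formula mentioned at the start of this discussion (the mean of the squared scores minus the squared mean), and the divisor is N because the five managers are treated as a population.

```python
import math

scores = [10, 7, 5, 2, 1]          # positive newspaper reports, P1 through P5
n = len(scores)

# Definitional formula: average squared deviation from the mean.
mean = sum(scores) / n                             # 5.0
deviations = [x - mean for x in scores]            # +5, +2, 0, -3, -4
sum_of_squares = sum(d ** 2 for d in deviations)   # 54.0
variance_definitional = sum_of_squares / n

# Computational formula: mean of the squared scores minus the squared mean.
variance_computational = sum(x ** 2 for x in scores) / n - mean ** 2

print(sum(deviations))                     # deviation scores always sum to zero
print(round(variance_definitional, 1))     # 10.8
print(round(variance_computational, 1))    # 10.8
print(math.isclose(variance_definitional, variance_computational))
```

The two results are compared with `math.isclose` rather than `==` because floating-point subtraction can leave a tiny rounding difference between them.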
(Note: When calculating the variance or standard deviation for a population, as we did, the sum of squares score is divided by the total number of scores [N]. When calculating the variance or standard deviation for a sample drawn from a population, the sum of squares is divided by the total number of scores minus one [n - 1]. Capital N is used for a population; lowercase n is used for a sample.)

FIGURE 11.3

(A) Definitional Formula

$$\sigma^2 = \frac{\sum (X - \bar{X})^2}{N}$$

STEPS:
1. Find the mean for the group of scores (equals 5).
2. Subtract the mean from each score, noting whether the score is greater or less than the mean by using positive and negative signs. This yields the deviation score.
3. Square each deviation score (multiply it by itself) and add the squared scores to get the sum of squares (equals 54).
4. Divide the sum of squares score by the number of scores (N = 5) to get the variance (equals 10.8). (Note: Use the number of scores when calculating variance for a population; use the number of scores minus one when calculating the variance for a sample.)

Scores ($X$)    Score - Mean ($X - \bar{X}$)    (Score - Mean)² ($(X - \bar{X})^2$)
                (Deviation Scores)              (Deviation Scores Squared)
10              +5                              25
 7              +2                               4
 5               0                               0
 2              -3                               9
 1              -4                              16

$\bar{X} = 5$                                   $\sum (X - \bar{X})^2 = 54$ (Sum of Squares)

Variance = 54 ÷ 5 = 10.8

(B) Computational Formula

$$\sigma^2 = \frac{\sum X^2}{N} - \bar{X}^2$$

STEPS:
1. Square each score and add up the squared scores to get the sum of squares (179). Divide this sum of squares by the number of scores (N = 5) to get the first part of the equation (equals 35.8).
2. Find the mean for the group of scores (equals 5), and square this mean score (equals 25) to get the second part of the equation.
3. Subtract the value obtained from step 2 (25) from the value obtained for step 1 (35.8) to get the variance (equals 10.8).

Scores ($X$)    Squared Scores ($X^2$)
10              100
 7               49
 5               25
 2                4
 1                1

$\bar{X} = 5$   $\sum X^2 = 179$ (Sum of Squares)
$\bar{X}^2 = 25$
$\frac{\sum X^2}{N}$ = 179 ÷ 5 = 35.8

Variance = 35.8 - 25 = 10.8

A high variance tells researchers that most scores are far away from the mean; a low variance indicates that most scores cluster tightly about the mean. But a variance score is confusing because it is expressed in units of squared deviations about the mean, which are not the same as the original unit of measurement. (We now see that if the definition for variance sounded like "double-talk," that's because it literally is!) What is needed is a measure of dispersion expressed in the same units as the original measurement. For that, we turn to the measure of dispersion known as standard deviation.

Standard Deviation.

Standard deviation (SD; sample standard deviation, $s$; population standard deviation, $\sigma$) is a measure of dispersion that explains how much scores in a set of interval/ratio data vary from the mean, expressed in the original unit of measurement. The standard deviation of a distribution is found by taking the square root of the variance; that is, the number that when multiplied by itself equals the variance.

In the example of the five newspaper coverage scores (see Figure 11.3), the standard deviation of this population ($\sigma$) equals the square root of 10.8, or 3.29. This figure can be thought of as the average amount and, therefore, the best description of the dispersion of this distribution (most managers had between 1.71 and 8.29 positive newspaper reports; the mean minus and plus one standard deviation), just as the mean (5 reports) is the average score and, therefore, the best description of the central tendency of this distribution (although this analogy is not precisely correct).

The mean and standard deviation are the measures of central tendency and dispersion reported most frequently by researchers when analyzing interval/ratio data.
These two statistics are often reported in a table, which is helpful when scores for different conditions or groups of people are compared. Mongeau and Carey (1996), for example, investigated how initiation of a first date influences people's expectations about the occurrence of sexual activity on that date. They first randomly assigned men and women to read one of three initiation scenarios: (a) female asks (the female asked the male out on a date to a movie they had discussed), (b) female hints (the female indicated her interest in seeing the movie, followed immediately by the male asking her on the date), or (c) male asks (the male asked the female on the date without a preceding hint). Half the male and female participants then evaluated the male target and the other half evaluated the female target. Part of that evaluation included a 12-item scale that assessed sexual expectations for the date, ranging from 0 (wants no sexual or physical activity to occur on the date) to 12 (wants to engage in sexual intercourse on the date). The researchers, thus, conducted a 3 (initiation) × 2 (gender of target) × 2 (gender of participant) experiment, creating 12 conditions in all (see Chapter 7). As part of the analysis and reporting of the data, Mongeau and Carey included a table that presents the means and standard deviations (along with the number of participants) for these 12 conditions (see Figure 11.4). This table makes it easy to see some of the differences among the conditions, such as the way males approach a first date with heightened sexual expectations, especially when a female initiates the date.
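As a minimal sketch of the standard deviation described in this section, the following Python uses the standard library's `statistics` module on the five newspaper-report scores. `pstdev` divides the sum of squares by N (population) and `stdev` divides by n - 1 (sample). The interval of one standard deviation around the mean is computed only to illustrate the "most managers" remark, and the variable names are assumptions.

```python
import statistics

reports = [10, 7, 5, 2, 1]   # positive newspaper reports for the five managers

mean = statistics.mean(reports)
population_sd = statistics.pstdev(reports)   # square root of 54 / 5
sample_sd = statistics.stdev(reports)        # square root of 54 / 4

# Scores that fall within one population SD of the mean.
low, high = mean - population_sd, mean + population_sd
within_one_sd = [x for x in reports if low <= x <= high]

print(round(population_sd, 2))           # 3.29
print(round(sample_sd, 2))               # 3.67
print(round(low, 2), round(high, 2))     # 1.71 8.29
print(within_one_sd)                     # [7, 5, 2]
```

Treating the same five scores as a sample rather than a population gives a slightly larger standard deviation, because the sum of squares is divided by 4 instead of 5.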

DESCRIBING DATA IN STANDARD SCORES

Calculating the mean and standard deviation for a set of interval/ratio measurements allows for an interesting and important manipulation of data. Researchers often report how many standard deviations a particular score is above or below the mean of its distribution; this type of score is called a standard score (or standard normal deviate). Standard scores provide a common unit of measurement that indicates how far any particular score is away from its mean.

There are many types of standard scores (e.g., Z-scores and the stanine scale; see Jaeger, 1990). One that is used frequently by researchers is the z-score. The formula for z-scores is:

$$z = \frac{X - \bar{X}}{SD}$$
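The formula can be sketched in Python using the five newspaper-report scores from the variance example. The function name is hypothetical, and the sketch assumes the population standard deviation (dividing by N), which matches the value of 3.29 used for these scores earlier in the chapter.

```python
import statistics


def z_scores(scores):
    """Convert raw scores to z-scores: (score - mean) / standard deviation.

    Assumes the scores are a whole population, so pstdev (divide by N) is used.
    """
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)
    return [(x - mean) / sd for x in scores]


reports = [10, 7, 5, 2, 1]
zs = z_scores(reports)

print([round(z, 2) for z in zs])   # [1.52, 0.61, 0.0, -0.91, -1.22]
print(abs(sum(zs)) < 1e-9)         # True: the z-scores of a set sum (and average) to zero
```

For a sample rather than a population, `statistics.stdev` could be swapped in, which would shrink each z-score slightly.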


FIGURE 11.4  THE IMPACT OF INITIATION TYPE, GENDER OF TARGET, AND GENDER OF PARTICIPANT ON PERCEPTIONS OF SEXUAL INTEREST ON THE DATE

                     Male Participant                      Female Participant
                 Male Target       Female Target       Male Target       Female Target
Female Asks      M = 7.56          M = 2.68            M = 2.96          M = 2.58
                 SD = 4.69         SD = [illegible]    SD = 3.92         SD = 2.21
                 n = 18            n = [illegible]     n = [illegible]   n = 24

Female Hints     M = 4.46          M = 2.91            M = 3.04          M = 1.53
                 SD = [illegible]  SD = 2.56           SD = 3.26         SD = 0.96
                 n = 24            n = 22              n = 21            n = 19

Male Asks        M = 4.19          M = 2.75            M = 2.52          M = 2.11
                 SD = 3.75         SD = 1.83           SD = 1.36         SD = 1.07
                 n = 26            n = 20              n = 21            n = 28

Source: Paul A. Mongeau and Colleen M. Carey, "Who's Wooing Whom II? An Experimental Investigation of Date-Initiation and Expectancy Violation," Western Journal of Communication, 60(3), p. 204, copyright © 1996 by the Western States Communication Association. Used by permission of the Western States Communication Association.
NOTE: Scores can range from 0 (wants no sexual or physical activity to occur on the date) to 12 (wants to engage in sexual intercourse on the date).

The formula says that a z-score can be computed for any score in a distribution by dividing its deviation score (how much the individual score differs from the mean) ($X - \bar{X}$) by the standard deviation for the distribution (SD). Using the data presented earlier on number of positive newspaper reports (10, 7, 5, 2, 1), dividing the deviation scores (+5, +2, 0, -3, -4, respectively) by the standard deviation (3.29) produces z-scores of +1.52, +.61, 0, -.91, and -1.22, respectively. Each z-score indicates how many standard deviations that score is from the mean of this distribution. If you calculate the mean of this set of z-scores, it equals zero, which is why it is called a "Z"-score.

Standard scores, like z-scores, are used by researchers in some important ways. First, researchers can meaningfully interpret an individual score within a distribution by showing how many standard deviations it is away from the mean of that distribution. For example, a standard score of -1.96 tells researchers that the individual score in question is almost two standard deviations below the mean. In the next chapter, when we discuss normal distributions, the significance of this will become clear, for it means that, under normal conditions, only 2.5% of the population score this badly.

One of the best-known examples of using standard scores in this way is the deviation-IQ scale, which is used with many group-administered intelligence (IQ) tests (see Jaeger, 1990). On the basis of numerous national samples, this scale is designed to have a mean score of 100 and a standard deviation of 15. So a person who scores 115 has a standard score of +1.00, while a person who scores 85 has a standard score of -1.00. One advantage of these standard IQ scores is that because the percentage of people who achieve any particular standard score is known (which, again, we will explain in Chapter 12 when talking about normal distributions), the percentile rank can be calculated for any standard score. For example, a person who has a +1.00 standard score is in the 84th percentile, because 50% of people score below the mean and approximately another 34% score between the mean and +1.00 SD.
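The percentile reasoning above can be sketched in Python with the standard library's `statistics.NormalDist` (available in Python 3.8 and later). The sketch assumes that scores follow a normal distribution, which the text defers to Chapter 12; the function names and constants are illustrative.

```python
from statistics import NormalDist

IQ_MEAN, IQ_SD = 100, 15   # deviation-IQ scale as described in the text


def standard_score(iq):
    """z-score of an IQ score on the deviation-IQ scale."""
    return (iq - IQ_MEAN) / IQ_SD


def percentile_rank(z):
    """Percentage of a normal distribution falling below standard score z."""
    return NormalDist().cdf(z) * 100


print(standard_score(115), standard_score(85))      # 1.0 -1.0
print(round(percentile_rank(standard_score(115))))  # 84
print(round(percentile_rank(-1.96), 1))             # 2.5
```

The same two functions work for any scale once its mean and standard deviation are known, provided the normality assumption is reasonable for that scale.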


Because percentile ranks can be derived from standard scores in this way, IQ scores, and many other types of scores, are often reported using standard scores instead of the original unit of measurement.

Another use of standard scores has to do with the fact that different people sometimes do not use the same measurement scale in the same way. For example, suppose three trained judges use a 10-point scale, ranging from poor (1) to excellent (10), to evaluate all 10 public speeches of the same type given at a forensics tournament (a competitive tournament for undergraduates where various types of public speeches are performed and evaluated). As Figure 11.5 shows, the three judges tend to use different portions of the same scale. Judge A tends to use the top portion of the scale (M = 7.3), judge B typically uses the middle portion (M = 4.4), and judge C uses the lower portion (M = 3.0). These public speeches, thus, appear to have been evaluated very differently by these three judges, even though they used the same 10-point scale. But notice how converting these ratings to z-scores reveals some important similarities. For example, speech 7, which received a rating of 7 from judge A and a 4 from judge B, has just about the same z-score from both judges (-.21 versus -.22, respectively). So this speech is actually evaluated pretty much the same way by these two judges when standard scores are used.

And these standard scores can also help to differentiate the speeches from one another. For example, suppose the forensics tournament organizers want to honor the highest rated speech of this type with a "Top Speech Award." If only mean ratings across the three judges are used, speeches 6 and 9 are tied as the best speech (M = 7.33). But when the ratings are converted to z-scores, and a mean z-score is calculated across the three judges' z-scores, speech 6 (mean z-score = +1.54) is superior to speech 9 (mean z-score = +1.48) and should receive the award. Thus, even though the same (valid and reliable) measurement scale may be used, relying on raw scores may not tell the whole story about the data. For this reason, some researchers first convert raw scores into standard scores, or even transformed standard scores (such as multiplying each z-score by 100 to eliminate the decimal point), and then perform subsequent analyses on the standard scores. Data kept in their original unit of measurement and not converted to standard scores are referred to as raw or unstandardized scores.

FIGURE 11.5

             JUDGE A            JUDGE B            JUDGE C            ACROSS JUDGES
Speech    Rating  z-score    Rating  z-score    Rating  z-score    Mean Rating  Mean z-score
 1           5     -1.62        3     -.78         1     -1.29        3.00         -1.23
 2           6      -.92        4     -.22         3      0.00        4.33          -.38
 3           7      -.21        5     +.33         2      -.65        4.67          -.18
 4           8      +.49        3     -.78         4      +.65        5.00          +.12
 5           9     +1.20        5     +.33         3      0.00        5.67          +.51
 6          10     +1.90        7    +1.44         5     +1.29        7.33         +1.54
 7           7      -.21        4     -.22         2      -.65        4.33          -.36
 8           6      -.92        2    -1.33         1     -1.29        3.00         -1.18
 9           8      +.49        8    +2.00         6     +1.94        7.33         +1.48
10           7      -.21        3     -.78         3      0.00        4.33          -.33
Mean       7.3                4.4                3.0
SD        1.42               1.80               1.55

Standard scores also allow comparisons between scores on different types of scales for the same individual or for different people. Here's an example of the first. Say you present a speech in a public speaking (PS) class and write a paper in an organizational communication (ORG) class. Each professor uses a 100-point scale to evaluate the work, although the criteria used to award points are probably very different for a speech and a paper, so these are two different scales. Furthermore, both professors intend to curve the grades given on the basis of all the scores obtained (e.g., the highest score becomes the upper range for a grade of "A"). In the PS class, you score 85 on the speech and in the ORG class you score 80 on the paper. In which class did you do better? Intuitively, you might say the PS class because the score of 85 is higher than the score of 80 in the ORG class, and both are out of a possible 100 points. And you might be right. But what if the score of 85 was actually the lowest score in the PS class, while the score of 80 was the highest in the ORG class? In that case, you clearly did much better in the ORG class, and this would be apparent if these scores were converted to z-scores. The score of 80 in the ORG class will have a positive z-score, whereas the score of 85 in the PS class will have a negative z-score. And because the grades in each class are curved to the distribution of scores in that particular class, your grades will reflect this important difference.

Finally, people say you can't compare apples and oranges (although this isn't the best analogy since both are fruits), but standard scores actually allow comparisons between different people's scores on different types of measurements. Suppose a sports agent has two clients, a major league starting baseball pitcher and an NFL starting football quarterback, and wants to determine who is more valuable, so that the agent can put the most energy into making the best deal for the best client.
Now sports enthusiasts probably would agree that an important "stat" for a baseball pitcher is the number of strikeouts recorded per game, while the number of completions per game is an important "stat" for football quarterbacks. The agent, however, can't compare these two "stats" directly. After all, the greatest number of strikeouts a pitcher could have in a 9-inning game (assuming the game doesn't go into extra innings) is 27 (9 innings × 3 outs per inning). But the football quarterback can potentially have a lot more than 27 completions in a game (e.g., by throwing short passes on virtually every play). So these two statistics are like apples and oranges; you can't compare them directly.

The agent, however, can compare these two clients' statistics by converting their number of strikeouts and completions to standard scores. That is, the agent calculates the mean and standard deviation of strikeouts per game for starting baseball pitchers and does the same for completions per game for starting football quarterbacks (assuming all the necessary information for doing so is available). The agent then calculates a z-score for each athlete (see Figure 11.6). The agent can now directly compare the two clients and see that the football quarterback is a better quarterback (z-score = -1.29) than the baseball pitcher is a pitcher (z-score = -2.00), although neither is doing particularly well (both have high negative z-scores). It might be time for the agent to get some new clients!

FIGURE 11.6

BASEBALL PITCHER
1. Mean (M) number of strikeouts per game for all pitchers = 6.0
2. Standard deviation (SD) of strikeouts per game for all pitchers = 1.5
3. This pitcher's strikeouts per game = 3.0
4. Subtract the mean from this pitcher's strikeouts per game = -3.0
5. Divide step 4 by the SD = -2.00

FOOTBALL QUARTERBACK
1. Mean (M) number of completions per game for all quarterbacks = 15.0
2. Standard deviation (SD) of completions per game for all quarterbacks = 6.2
3. This quarterback's completions per game = 7.0
4. Subtract the mean from this quarterback's completions per game = -8.0
5. Divide step 4 by the SD = -1.29

We just used a kind of "apples and oranges" case (two types of sports statistics) to illustrate the point, but the principle applies to comparing most types of measurements, even if they are very different, such as comparing a baseball pitcher's strikeouts with the quality of a person's public speeches. But two cautions are warranted. First, in making such comparisons, researchers and practitioners (like sports agents) have to be confident that the statistics being used address some important, common "dimension," that they are ones on which a comparison ought to be based. Strikeouts by a pitcher and completions by a quarterback seem pretty important and even comparable in that they both address throwing a ball in their respective sport. Using other statistics, such as number of strikeouts and number of fumbles, may not be a meaningful way to compare these athletes. And when very different comparisons are made, this caution is even more important. We're not sure, for example, what the basis would be for comparing strikeouts by a pitcher with the quality of a person's public speeches. Second, comparisons for the purposes of evaluating complex processes and outcomes, such as job performance, should be made on the basis of more than one variable. To compare these two athletes more fully, many variables besides strikeouts and completions would need to be taken into account.

In closing, converting scores on a measurement scale to standard scores that measure how far away they are from the mean of their distribution is an important tool for describing data. We will return later to the principle of seeing how many standard deviations data are from a mean, for it is the cornerstone of the inferential statistics covered in the next three chapters.

DESCRIBING DATA THROUGH VISUAL DISPLAYS

We said earlier that measures of central tendency and dispersion are used to summarize relatively large data sets because it's too difficult to verbally share and process all the data.
It's also often helpful to describe data sets through visual means. Remember from the examples that opened this chapter how the colleague at work used charts and figures to show sales over the past few months. And notice how we have used visual figures in this chapter to help explain summary statistics.

Researchers, educators, and other professionals alike often use tables, figures, graphs, and other forms to visually display distributions of data. These visual displays can highlight important information in the data and/or make complex data more understandable. Wallgren, Wallgren, Persson, Jorner, and Haaland (1996) argue that "good charts are information" (p. 3) that help people see both main features of the data (the forest), as well as specific details (the trees). While perhaps overstating the claim, "a good visual display says more than a thousand words." Of course, the reverse is also true in that "bad charts convey disinformation" (Wallgren et al., 1996, p. 6).

Constructing good visual displays of data is as much an art as it is a science. To construct such displays and realize their benefits, one must know what types of visual displays are possible, how to construct them, and when to use them. There is a wide variety of visual forms that can be used to display quantitative data, although most are variations on a few basic principles. Here, we examine six common types used by researchers and others to visually display quantitative data, and provide examples of each from communication research studies: frequency tables, pie charts, bar charts, line graphs, frequency histograms, and frequency polygons. As you might suspect by now, the choice of which visual display to use is influenced by the type of data one seeks to portray. The first four visual displays are particularly useful for showing differences between the groups (categories) that comprise a nominal independent variable, while frequency histograms and frequency polygons are useful for showing relationships between an ordered independent variable and other variables.
Frequency Tables

In one way or another, most visual displays are attempts to describe frequency distributions, a tally of the number of times particular values on a measurement scale occur in a data set. The most basic way is to visually display the absolute frequency of the data obtained, the actual number of times each category or measurement point occurs in a distribution, in a frequency table, a visual display in table form of the frequency of each category or measurement point on a scale for a distribution of data. Another way is to visually display the relative frequency of the data, the proportion of times each category or measurement point occurs (found by dividing the number of cases for a particular category or measurement point by the total number of cases for the entire distribution). Some displays even include the cumulative frequency (CF), the total frequency up to and including any particular category or measurement point.

We've already seen a communication research example of a frequency table in Figure 11.1, but let's examine a few more. Hirokawa and Keyton (1995) were interested in the factors that members of organizational work teams believe facilitate and inhibit effective group performance. They studied groups of 4-8 members composed of school principals, teachers, and school resource personnel (e.g., nurses, counselors, and psychologists) from nine different schools. The groups had been designated as student assistance groups within an ongoing drug and alcohol abuse prevention program implemented by the school system. As preparation for their participation in the team, all group members had to attend an intensive two-day workshop conducted by health care experts. Each group was responsible for an on-site, student-directed drug and alcohol abuse prevention program, and each group's primary task during the workshop was to evaluate potential "at-risk" students, find a source of appropriate assistance for them, and follow up on the effectiveness of the recommended assistance.
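As a brief aside, the absolute, relative, and cumulative frequencies defined at the start of this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `frequency_table` and the ratings below are hypothetical and do not come from any study discussed in this chapter.

```python
from collections import Counter


def frequency_table(scores):
    """Return rows of (value, absolute, relative, cumulative), sorted by value."""
    counts = Counter(scores)   # absolute frequency of each measurement point
    total = len(scores)        # total number of cases in the distribution
    rows = []
    cumulative = 0
    for value in sorted(counts):
        absolute = counts[value]
        cumulative += absolute  # frequency up to and including this point
        rows.append((value, absolute, absolute / total, cumulative))
    return rows


# Hypothetical ratings on a 5-point scale (illustrative data only).
ratings = [3, 4, 4, 5, 2, 3, 4, 1, 3, 4]

table = frequency_table(ratings)
print(f"{'Value':>5} {'Absolute':>9} {'Relative':>9} {'Cumulative':>10}")
for value, absolute, relative, cumulative in table:
    print(f"{value:>5} {absolute:>9} {relative:>9.2f} {cumulative:>10}")
```

Sorting the values before accumulating is what makes the cumulative column meaningful, so this sketch suits ordered measurement points. For purely nominal categories, the absolute and relative columns are the ones that usually get reported.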
To identify the factors members believed affected their group's task progress, members were asked two open-ended questions about what aspect(s) of their group helped (facilitative factors) and hindered (inhibitive factors) their group's progress. A content-analytic scheme was developed and used to code the answers (see Chapter 9). In presenting the results, the researchers constructed two frequency tables, one that summarized the facilitative factors, and one for the inhibitive factors (see Figure 11.7). Frequency tables are used quite often in this way to describe the frequency of occurrence of categories, such as the number of times the categories associated with facilitative and inhibitive factors are mentioned. As Figure 11.7 also illustrates, frequency tables usually include both the absolute frequency and relative frequency (e.g., dividing the actual count of 9 for organizational assistance by the 47 total participants studied yields a percentage of 19).

Figure 11.7

TABLE 1: SUMMARY OF FACILITATIVE FACTORS

Factor                                 Frequency   Percentage
Organizational assistance                      9           19
Compatible work schedules                     18           38
Information resources                         10           21
Interested/motivated group members            15           32
Good group leadership                          9           19
Clear organizational expectations              5           11
No organizational interference                 7           15

TABLE 2: SUMMARY OF INHIBITIVE FACTORS

Factor                                 Frequency   Percentage
Insufficient time                             22           47
Information resources                         10           21
Procedural conflicts                           3            6
Poor group leadership                          5           11
Uninterested/unmotivated members               7           15
No organizational assistance                   5           11
No financial compensation                      6           13
Changing organizational expectations           4            9

Source: Randy Y. Hirokawa and Joann Keyton, "Perceived Facilitators and Inhibitors of Effectiveness in Organizational Work Teams," Management Communication Quarterly, 8(4), pp. 438-439, copyright © 1995 by Sage Publications, Inc. Reprinted by Permission of Sage Publications, Inc.
NOTE: N = 47. Percentages do not sum to 100% because participants could identify multiple factors.

Frequency tables are very effective for highlighting which categories are used most often, and they can also be helpful for comparing at a glance the frequency of different categories for two or more groups. For example, as part of their study, Hirokawa and Keyton (1995) asked group members to rate (on a 3-point scale) the extent to which they had established five facets of data collection and evaluation procedures and three aspects of their communication with external constituents. On the basis of the cumulative average scores in these two areas, groups that fell in the bottom third of the range were labeled "low-effective" while groups that fell in the upper third of the range were identified as "high-effective." In discussing the differences between low- and high-effective groups on the facilitative and inhibitive factors identified previously, the researchers constructed two additional frequency tables (see Figure 11.8). These tables allow one to eyeball differences between these two groups. (The researchers actually identify statistically significant differences between low- and high-effective groups in the table through the use of asterisks and in the accompanying note. We examine principles of statistical significance testing in Chapter 12 and particular statistical tests of differences between groups in Chapter 13.)

Frequency tables are also helpful for showing changes over short or long periods of time.
Sapolsky (1982), for example, conducted a two-year (1978-1979) content analysis of the frequency of sexual incidents on prime-time network television shows. The findings, reported in a frequency table, showed that, when gender of the initiator and marital status of partners were combined, the total number of noncriminal sex acts increased from 141 in 1978 to 247 in 1979, with the largest increase involving unmarried partners (22 in 1978 to 138 in 1979). To illustrate long-term change, Bogart (1985) studied changes in the content of United States newspapers over a 20-year period, comparing the percentage of newspapers carrying particular types of features (70 topics in all) at least once a week in 1963, 1974, 1979, and 1983. The findings, reported in a frequency table, showed that while categories such as health and medicine stayed relatively the same over the 20 years (68%, 71%, 66%, 63%, respectively), other categories such as science steadily decreased (34%, 24%, 14%, 9%) and such new categories as "people" emerged (no instances until 50% in 1983). In both studies, the researchers used frequency tables to report some of the data. The next step, of course, would be to make sense of the changes found. For example, what might the steady decline in science articles and the sudden emergence of "people" articles say about our culture?

Frequency tables are an important way for researchers to visually display data. They are, however, fairly simple drawings, in that they do not rely on such graphics as shaded areas or connected lines. The next five visual displays do.

Pie Charts

One way to visually illustrate the frequency of categories using shaded areas is through pie charts, which are circles divided into segments, each proportional to the percentage of the total frequency count that its category represents. Pie charts are a favorite way of visually presenting information in the popular media, such as on television and in newspapers and magazines (USA Today seems to have a pie chart virtually every day). They are also frequently employed in business presentations, and even some politicians use them. Ross Perot, for example, used pie charts extensively in his television infomercials during the 1992 presidential campaign, and they became a standard feature of comedic portrayals of him. Although pie charts are used infrequently in scholarly research reports, they have been used to provide descriptive information about a research data set. Allman (1998), for example, studied physicians' self-disclosure about medical mistakes.
Thirty-nine internists and family medicine physicians completed a questionnaire about a medical error they had made at some point after completing their residency. One of the questions asked them to indicate to whom they disclosed the medical error. Allman created a pie chart to visually show the recipients of this self-disclosure (see Figure 11.9). As the pie chart shows, physicians discussed their error primarily with another physician (36%), followed by a significant other (23%), and then the patient or patient's family (21%). Seldom were such mistakes disclosed to other medical personnel (12%), risk management or quality assurance personnel (4%), personal attorney or counselor (3%), or a nonmedical friend (1%).

[Figure 11.8: two frequency tables, each reporting Frequency and Percent separately for Low-Effective (n = 10) and High-Effective (n = 26) groups.
TABLE 3: FACILITATIVE FACTORS FOR LOW- AND HIGH-EFFECTIVENESS GROUPS. Factors: Organizational assistance; Compatible work schedules; Information resources; Motivated members; Good leadership; Clear expectations; No interference.
TABLE 4: INHIBITIVE FACTORS FOR LOW- AND HIGH-EFFECTIVE GROUPS. Factors: Time availability; Information resources; Procedural conflicts; Poor leadership; Uninterested members; No organizational assistance; No financial compensation; Changing expectations.
Source: Randy Y. Hirokawa and Joann Keyton, "Perceived Facilitators and Inhibitors of Effectiveness in Organizational Work Teams," Management Communication Quarterly, 8(4), pp. 439-440, copyright © 1995 by Sage Publications, Inc. Reprinted by Permission of Sage Publications, Inc.
NOTE: Percentages do not sum to 100% because participants could identify multiple factors. *denotes difference in proportion that is significant at the .05 level; **denotes difference in proportion that is significant at the .01 level; ***denotes difference in proportion that is significant at the .001 level.]

[Figure 11.9: pie chart of the recipients of physicians' self-disclosure about a medical error. Legend: Another Phys; Signif Other; Patient or Family; Oth Med Personnel; Risk Mgt or QA; Atty or Counselor; Nonmed Friend.
Source: Joyce Allman, "Bearing the Burden or Baring the Soul: Physicians' Self-Disclosure and Boundary Management Regarding Medical Mistakes," Health Communication, 10(2), p. 185, copyright © 1998 by Lawrence Erlbaum Associates, Inc. Reprinted by permission of Lawrence Erlbaum Associates, Inc.]

Pie charts can also be very helpful for visually showing differences between groups. A. Rodriguez (1996), for example, analyzed the production of Noticiero Univision, the nightly national newscast of the largest Spanish-language television network in the United States. Part of this study included an extensive comparative content analysis

of the stories presented by Noticiero Univision and ABC's World News Tonight with Peter Jennings. Two pie charts (see Figure 11.10) illustrate "with broad strokes the most obvious differences between these two national newscasts" (Rodriguez, 1996, p. 66). Clearly, Noticiero Univision presents a social world, that of Latino communities in the United States and Latin America, rarely seen on mainstream commercial television. But Rodriguez's naturalistic (ethnographic) research also showed that Noticiero Univision embraces many of the same structural features as mainstream telecasts, such as "journalistic objectivity." Rodriguez concluded that "in its simultaneous embrace of journalistic objectivity and U.S. Latino panethnic identity, this newscast is a resource for Latinos in their acculturation to U.S. society" (p. 16).

[Figure 11.10: two pie charts comparing story topics on Noticiero Univision and on ABC's World News Tonight. Segments shown for Univision include Latin America (45.3%), Washington D.C. (16.8%), U.S. Latino (14.5%), International (8.3%), and Miscellaneous (3.5%). Segments shown under ABC TOPICS include Washington D.C. (36.2%), International (18.0%), and Miscellaneous (1.7%).
Source: America Rodriguez, "Objectivity and Ethnicity in the Production of the Noticiero Univision," Critical Studies in Mass Communication, 13(1), p. 67. Copyright by the National Communication Association. 1996. Reprinted by permission of the publisher.]

Bar Charts

Bar charts are a common and relatively simple visual depiction that use shaded "blocks" (rectangles and/or squares) to show the frequency counts for the groups that comprise a nominal independent variable. Bar charts typically are drawn using vertical blocks; that is, the independent variable groups are presented on the abscissa, the horizontal (or x) axis (an axis being a line used to construct a graph), which is why the independent variable is sometimes called the X variable, and the frequency of occurrence on the dependent variable (Y variable) is presented on the ordinate, the vertical (or y) axis, so that the blocks rise vertically up the page. (The y axis in a three-dimensional visual display references the vertical height axis; the z axis references the depth axis.) Horizontal bar charts, with blocks running horizontally from left to right, sometimes are used when the categories have long names or when there are many categories (say 10 or 20) (see Wallgren et al., 1996). Bar charts are especially helpful for visually depicting differences between groups on a variable.


For example, Lang, Geiger, Strickwerda, and Sumner (1993) tested how related and unrelated cuts differ in their effects on television viewers' attention, processing capacity, and memory for the information contained in television messages. Participants watched a videotape that contained 12 segments, separated by black, of regularly occurring network television shows. Six segments contained related cuts, those occurring between scenes that were related either by visual or audio information (e.g., a cut from one camera to another in the same visual scene), and six segments contained unrelated cuts, those occurring between scenes that were completely unrelated to one another (e.g., a cut from a scene in one program to a scene in a different program). The percentage of correct answers on a multiple-choice instrument was used to measure participants' information recall. The researchers used a bar chart to visually show the information recall difference between participants in the two conditions on this variable (see Figure 11.11). As you can see, and as predicted, participants remembered more information surrounding related than unrelated cuts.

[Figure 11.11: bar chart comparing information recall for the Related and Unrelated cut conditions.
Source: Annie Lang, Seth Geiger, Melody Strickwerda, and Janine Sumner, "The Effects of Related and Unrelated Cuts on Television Viewers' Attention, Processing Capacity, and Memory," Communication Research, 20(1), p. 21, copyright © 1993 by Sage Publications, Inc. Used by Permission of Sage Publications, Inc.]

A very common use of bar charts is to visually show differences between the groups formed from the combination of two or more independent variables, and these are called grouped or segmented bar charts. As Wallgren et al. (1996) explain, "The different categories are represented by different bars in the same chart with common axes. In order to distinguish the categories we use different patterns of shading and a legend" (p. 26). For example, Le Poire (1994) studied differences in how people respond to gay males and persons with AIDS (PWAs). In one study, 95 participants interacted with a male confederate. They were told to assume that they would like to get to know this person better, and a 22-item self-disclosure sheet was used to guide the interactions. The confederate was trained to answer all questions identically, and did so, except for one question, which read "My greatest fear is...." For this question, the confederate responded in one of three ways: (a) "not getting a job after graduation" (nonstigmatized condition), (b) "that my family will find out I am gay" (gay stigmatization condition), or (c) "that the HIV test I just took may be positive" (PWA stigmatization condition). Afterwards, participants were asked to rate their desire for future interaction with the confederate on a four-item, 7-point semantic differential; the four items were combined and a mean score computed, with a higher score indicating more desire. Because Le Poire suspected that participants' gender made a difference, she also analyzed differences between male and female participants' desire for future interaction within each confederate condition. She displayed those results in a grouped bar chart that uses different shading for males and females (explained in a legend) in the three conditions (see Figure 11.12). As you see, and as statistical analyses revealed, there is a very different pattern for females and males in the three conditions. While females desired the most future interaction with the PWA, followed by the gay male, and then by the nonstigmatized male, males desired the most future interaction with the nonstigmatized male, followed by the PWA, followed by the gay male. How would you interpret these findings?

[Figure 11.12: grouped bar chart titled "Attraction toward Gays and PWAs: Desire for Future Interaction." The x-axis shows Stigma Condition (Nonstigma, Gay, PWA); the y-axis runs from 3.8 to 5.8; the legend distinguishes Males and Females.
Source: Beth A. Le Poire, "Attraction toward and Nonverbal Stigmatization of Gay Males and Persons With AIDS: Evidence of Symbolic over Instrumental Attitudinal Structures," Human Communication Research, 21(2), p. 254, copyright © 1994 by Sage Publications, Inc. Reprinted by Permission of Sage Publications, Inc.]

Line Graphs

A final type of visual display particularly useful for showing differences between independent variable groups is the line graph, which uses a single point to represent the frequency count on a dependent variable for each of the groups and then connects these points with a single line. The simplest type of line graph compares two groups with respect to a dependent variable. For example, Weiss, Imrich, and Wilson (1993) assessed the impact of two desensitization exposure strategies on children's emotional reactions to a frightening movie scene. Young boys and girls either were exposed to a live earthworm demonstration (live exposure condition) or received no such exposure (no live exposure condition). In the live exposure condition, the boys and girls watched an experimenter reach into a bowl of worms, pick one up, and hold it for all to see and, thereby, show them that they weren't harmful. All the children then viewed a frightening scene from the PG-rated movie Squirm, in which a man and woman are fishing in a boat, a container of worms tips over and they spill out, and as the man moves toward the woman, he falls over and the worms begin to attack his face. The boat capsizes, and the scene ends with the man running out of the water. Immediately after viewing this segment, the children were asked how scared they felt while watching it, using a scale that ranged from 0 (not scared at all) to 4 (very very scared). The percentage of boys and girls expressing fear in the two conditions was shown in a line graph (see Figure 11.13). As you see, and as statistical analyses revealed, the percentage of boys expressing fear after seeing the live worms (34%) was much less than the percentage of boys who had not seen the live worms (59%). Such exposure, however, had no impact on girls, as 70% who had live exposure expressed fear compared to 60% who did not have such exposure. The researchers posit that the desensitization procedure could not overcome girls' general dislike for insects, reptiles, and snakes.

Line graphs are particularly useful for showing changes in groups over two or more points in time.
For example, Mares and Cantor (1992) examined the effects of two types of portrayals of old age on the emotional responses of aging viewers. Two hundred and fifty aging participants (M = 75.1, SD = 5.0) were measured with regard to their degree of loneliness. Those falling in the top or bottom 20% of the scores (designated as lonely and nonlonely, respectively) were shown one of two 9-minute videotapes. One tape portrayed an aging man as isolated (e.g., his wife had just died and he described his loneliness), whereas the other tape portrayed the man as integrated (e.g., he was successful and happy, living with his wife, and had many friends). Both before and after viewing the stimulus tape, participants' degree of affect was measured through the Multiple Adjective Affect Checklist (MAACL), which asks people to say how they feel by selecting from positive words (e.g., agreeable) and negative words (e.g., unhappy); the score ranges from 0 (maximum positive mood) to 43 (maximum negative mood). The researchers used a line graph to show the differences among the four groups from pre-viewing to post-viewing (see Figure 11.14). As you see, lonely people had more negative affect than nonlonely people prior to viewing the tape, but after watching the tape, the nonlonely/isolated group increased their negative affect substantially (and statistically significantly), whereas the lonely/isolated group actually decreased (statistically significantly) their negative affect. The finding that lonely people feel better after watching a negative program is actually predicted by Social Comparison Theory in that people are assumed to feel better after seeing another person's misery. Most important, the findings suggest that emotional responses to media content are complex and related to viewers' prior emotional states.

[Figure 11.13: line graph of the percentage of children expressing fear (y-axis, ticks from 30 to 80) in the No Live Exposure and Live Exposure conditions (x-axis), with separate lines for Boys and Girls.
Source: Audrey J. Weiss, Dorothy J. Imrich, and Barbara J. Wilson, "Prior Exposure to Creatures From a Horror Film: Live Versus Photographic Representations," Human Communication Research, 20(1), p. 55, copyright © 1993 by Sage Publications, Inc. Reprinted by Permission of Sage Publications, Inc.]

[Figure 11.14: line graph of MAACL scores at Pre-viewing and Post-viewing for four groups: Lonely/Integrated, Nonlonely/Isolated*, Lonely/Isolated*, and Nonlonely/Integrated.
Source: Marie-Louise Mares and Joanne Cantor, "Elderly Viewers' Responses to Televised Portrayals of Old Age: Empathy and Mood Management Versus Social Comparison," Communication Research, 19(4), p. 473, copyright © 1992 by Sage Publications Inc. Reprinted by Permission of Sage Publications Inc.
NOTE: The higher the MAACL score, the greater the negative affect. *p < .05.]

Frequency Histograms and Frequency Polygons

Pie charts, bar charts, and line graphs are very useful for visually showing differences between nominal independent variable groups with regard to a dependent variable. When an independent variable is measured using an interval or ratio scale, tables could be constructed to show the frequency counts for each of the measurement points, but these would sometimes be so big (e.g., a table with 100 rows to show percentages) that they would be overwhelming and confusing. So instead, researchers typically illustrate these frequency distributions visually. Two procedures used to show such frequency distributions are frequency histograms and frequency polygons.

Frequency histograms, like bar charts, use blocks to show how frequently each point on an interval/ratio scale occurs. However, because an interval/ratio scale has equal distances between the points on the scale, the blocks touch. For example, mass communication scholars are interested in identifying factors that influence "looks," those periods where people watching television look at and then away from the screen. J. J. Burns and Anderson (1993) studied looks by applying the theory of attentional inertia, which argues that "if a medium of information has held a person's attention for a period of time, a generalized tendency develops to sustain attention to that medium" (p. 778). The researchers tested hypotheses derived from the theory by videotaping male and female students watching an episode of the television show Magnum, P.I. followed by an episode of Cagney and Lacey. The videotapes were coded for looks using a computer-assisted coding system. Participants were then tested on their recognition of 232 separate items shown in the videotapes. To see the shape of the data, the researchers constructed a frequency histogram of look lengths on the basis of 1-second intervals, and provided a typical example for a single female viewer (see Figure 11.15).
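The grouping of look lengths into 1-second intervals described above can be sketched in Python. This is a minimal illustration, not the researchers' actual coding procedure: the function name `histogram_counts`, the choice to treat each interval as including its lower bound and excluding its upper bound, and the look lengths themselves are all assumptions made for the example.

```python
import math
from collections import Counter


def histogram_counts(look_lengths, width=1.0):
    """Count looks per interval; interval k covers [k*width, (k+1)*width)."""
    bins = Counter(math.floor(length / width) for length in look_lengths)
    last = max(bins)
    # Keep empty intervals in the result so adjacent blocks stay contiguous,
    # which is what lets the blocks of a histogram touch.
    return [bins.get(k, 0) for k in range(last + 1)]


# Hypothetical look lengths in seconds (illustrative, not the study's data).
looks = [0.4, 0.9, 1.2, 1.7, 1.9, 2.5, 3.1, 3.3, 6.8]

counts = histogram_counts(looks)
for k, count in enumerate(counts):
    # One row per 1-second interval; the run of '#' marks stands in for a block.
    print(f"{k:>2}-{k + 1:<2} s | {'#' * count}")
```

Unlike a bar chart of nominal categories, the intervals here have a fixed order and equal width, so an interval with no looks still occupies its place on the axis.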
Running along the x-axis is the look length in seconds; running along the y-axis is the frequency for the 200/232 items where the looks were separated by pauses of 3 seconds' duration or less. You can see that the scores are skewed to the left (see Chapter 12 for a discussion of distribution skewness), a shape actually predicted by one of the hypotheses derived from the theory.

[Figure 11.15: frequency histogram with Look Length in Seconds on the x-axis (ticks from 5 to 30) and frequency on the y-axis.
Source: John J. Burns and Daniel R. Anderson, "Attentional Inertia and Recognition Memory in Adult Television Viewing," Communication Research, 20(6), p. 792, copyright © 1993 by Sage Publications, Inc. Reprinted by Permission of Sage Publications, Inc.]

Frequency polygons are similar to line graphs, except that a line connects the points representing the frequency counts for each point on the interval/ratio scale used to measure the independent variable (rather than each group of a nominal independent variable, as in line graphs). For example, the Lang et al. (1993) study described earlier examined the effects of related and unrelated TV cuts on viewers' attention, processing capacity, and memory. The researchers predicted that both related and unrelated cuts would elicit orienting responses, an involuntary automatic response elicited by changes in the environment. To test this hypothesis, they attached electrodes to participants' forearms while they watched the television segments to collect heart rate data. The frequency polygon, with half seconds on the x-axis and heart rate on the y-axis (see Figure 11.16), and statistical analyses provided support for the hypothesis, as there was a deceleration of heart rate for the first 4 seconds (8 half seconds) after a cut, followed by an acceleration.

[Figure 11.16: frequency polygon with Half seconds on the x-axis (0.0 to 14.0) and heart rate on the y-axis (ticks from 80 to 84).
Source: Annie Lang, Seth Geiger, Melody Strickwerda, and Janine Sumner, "The Effects of Related and Unrelated Cuts on Television Viewers' Attention, Processing Capacity, and Memory," Communication Research, 20(1), p. 19, copyright © 1993 by Sage Publications, Inc. Adapted by Permission of Sage Publications, Inc.]

CONCLUSION

The statistics discussed in this chapter explain various ways in which researchers attempt to describe the data they have acquired. The first step in analyzing a set of data is to understand its characteristics. Once important characteristics have been determined, it is possible to go beyond description to infer conclusions about the data. Go back to the examples that started this chapter and you will see that the final three attempt to infer conclusions from the relevant data. The television newscaster concluded from the poll results that the election was all but over. The newspaper claimed, on the basis of percentages, that there exists an "important" difference between men and women in the standard of living in the first year after divorce. And in the final example, the television program concluded, on the basis of a statistical analysis, that there is a "substantial" relationship between eating a particular food and heart disease. To find out how researchers and others draw such conclusions, we now examine inferential statistics.