The Validity and Reliability of Pain Measures in Adults With Cancer. Mark P. Jensen

Author: Gregory Reeves
CRITICAL REVIEW

The Validity and Reliability of Pain Measures in Adults With Cancer

Mark P. Jensen

Abstract: To be most useful, clinical trials of cancer pain treatments should use pain measures that are both reliable and valid. A great variety of measures are now available that may be used to assess cancer pain. However, there are not yet any clear guidelines for selecting one or more measures over the others. The purpose of this article is to summarize the evidence concerning the validity and reliability of cancer pain measures. One hundred sixty-four articles were identified that provided psychometric data of pain measures among patients with cancer. The results indicate that commonly used single-item ratings of pain intensity are all valid and adequately reliable as measures of pain intensity, although some scales appear to be easier for patients with cancer to understand and to use than others. Multiple-item measures of pain intensity are reliable, but evidence concerning their validity is lacking. There is a paucity of research examining the psychometric properties of measures of cancer pain interference, pain relief, pain site, the temporal aspects of pain, and pain quality. This lack of evidence limits the conclusions that may be drawn concerning the reliability and validity of these other pain measures. Composite measures that combine ratings of pain intensity and pain interference into a single score appear to be both valid and reliable for describing patient populations, although their usefulness in clinical trials may be limited because they can obscure the contributions of intensity and interference to the total score. Proxy measures of cancer pain (pain ratings made by someone other than the patient) may be useful when patients are not able to provide pain ratings, but they should not be used as replacements for patient ratings when patient self-report measures are available. The discussion includes specific recommendations for selecting from among the available pain measures, as well as recommendations for future research into the assessment of cancer pain.

© 2003 by the American Pain Society

Key words: Cancer, cancer pain, pain assessment.

Clinical trials of cancer pain treatment are essential for identifying and estimating the effectiveness of interventions that provide cancer pain relief. For the results of such trials to be deemed valid, the pain measures used must have proven reliability and validity. However, despite the existence and use of many different pain measures, there has not yet been a review and synthesis of empirical findings regarding the reliability and validity of the measures commonly used in cancer pain research. The primary purpose of this article is to perform such a review to provide clinicians and researchers with data that may be used for selecting pain measures. Following a brief introduction to the concepts of validity and reliability, the methods and results of the review are presented. The discussion summarizes the findings from the review, presents specific recommendations concerning the selection and use of cancer pain measures, and suggests directions for future research that will clarify the psychometric properties of these measures.

Received March 11, 2002; Revised June 20, 2002; Accepted June 20, 2002 From the Department of Rehabilitation Medicine, University of Washington School of Medicine, and Multidisciplinary Pain Center, University of Washington Medical Center, Seattle, WA. Supported by funding from the American Pain Society’s Clinical Practice Guideline Program. Address reprint requests to M. P. Jensen, PhD, Department of Rehabilitation Medicine, Box 356490, University of Washington School of Medicine, Seattle, WA 98195-6490. E-mail: [email protected] © 2003 by the American Pain Society 1526-5900/2003 $30.00 + 0 doi:10.1054/jpai.2003.1

The Journal of Pain, Vol 4, No 1 (February), 2003: pp 2-21

Validity

Validity refers to the appropriateness, meaningfulness, and usefulness of a measure for a specific purpose. Validity is generally seen as the most important consideration in the evaluation of a measure.2 With respect to cancer pain measures, validity refers to the extent to which the measure(s) in question are valid and useful indicators of cancer pain or are useful predictors of important outcomes such as survival or quality of life. Although several types of validity can be considered, the most common types examined are content, construct, and criterion validity.2 A measure’s utility can also be considered an indication of that measure’s usefulness or validity.

Content validity concerns the degree to which the items of a measure are representative of some defined universe or domain of interest and is usually determined by the use of expert judgments.

Construct validity refers to the extent to which a measure assesses the specific domain or construct of interest. Evidence supporting the construct validity of a measure or assessment protocol comes from a variety of sources rather than a single source or study.

Criterion validity refers to a measure’s associations with one or more outcome criteria. In pain assessment, the criterion of primary importance is sensitivity to the effects of treatment or to changes in pain over time, because pain measures are used primarily for these purposes. However, pain measures may also be used to monitor or to predict the course of a disease state, to test hypotheses concerning the impact of pain on other outcomes or measures of functioning, or to place patients into specific diagnostic groups. Criterion validity data provide evidence for or against the use of the pain measure for these purposes.

A measure’s utility concerns its specific usefulness in particular settings and with particular populations. A measure may have a great deal of validity but may be too long or too difficult to administer, to understand, and to score to be of much practical use in clinical settings. Another measure, although useful for assessing pain in some situations or with some patient populations, may be difficult to administer in other settings or with other patient groups. Evidence concerning a measure’s relative utility, then, may be used to identify the specific situations, settings, and populations that are most appropriate for that measure.

Reliability

Reliability refers to the extent to which a score is free from errors of measurement. Many factors in addition to a patient’s experience of pain could potentially influence his or her response to a pain measure or scale. Such factors might include the specific assessment setting (eg, laboratory versus clinic), the person administering the measure (eg, a research assistant, clinician, primary health care provider, or family member), other subjective experiences and feelings (eg, being more or less fatigued or upset), or even motivational factors (eg, desiring to appear stoic, desiring to communicate a need for analgesic medications). Some measures may be difficult for patients to understand, adding another potential source of measurement error if these measures are used with patients who have limited cognitive abilities. The variance associated with these other factors, which is not associated with the specific domain of interest, is considered error variance. The most common estimates of reliability in pain assessment research are reliability coefficients, which can be used to estimate a measure’s internal consistency (eg, coefficient alpha, which reflects the association among items in a scale) and stability over time (eg, test-retest stability coefficients).
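As an illustration of the two kinds of reliability coefficient just described, the following is a minimal Python sketch, not taken from the reviewed studies. It computes coefficient alpha from item scores and a test-retest coefficient as a Pearson correlation. The function names (`cronbach_alpha`, `pearson_r`) and all ratings are hypothetical and chosen only for demonstration.

```python
from statistics import mean, pvariance


def cronbach_alpha(items):
    """Coefficient alpha for a multiple-item scale.

    `items` is a list of item-score lists: one inner list per item,
    with respondents in the same order in every inner list.
    """
    k = len(items)
    # Total score per respondent (sum across items).
    totals = [sum(scores) for scores in zip(*items)]
    # Sum of the individual item variances.
    item_variance = sum(pvariance(scores) for scores in items)
    return (k / (k - 1)) * (1 - item_variance / pvariance(totals))


def pearson_r(x, y):
    """Pearson correlation, usable as a test-retest stability coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical 0-10 ratings from five respondents on three items.
made_up_items = [
    [2, 5, 7, 3, 8],
    [3, 5, 6, 2, 9],
    [2, 6, 7, 4, 8],
]
print("alpha:", round(cronbach_alpha(made_up_items), 2))

# Hypothetical ratings on one item at two time points.
time_1 = [2, 5, 7, 3, 8]
time_2 = [3, 5, 6, 3, 7]
print("test-retest r:", round(pearson_r(time_1, time_2), 2))
```

Population variances are used throughout; because alpha depends only on the ratio of variances, using sample variances consistently would give the same value.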

The Purpose of This Review

Although a great deal of research has been published on the psychometric properties of cancer pain measures, to date, a comprehensive review of this literature has not been published. The purpose of this article is to present such a review by (1) identifying studies that include data concerning the reliability and validity of cancer pain measures, (2) summarizing and synthesizing the findings concerning the psychometric properties of the measures used in these studies, (3) making recommendations for the use of these measures, and (4) making recommendations for future research concerning the validity and reliability of cancer pain measures.

Methods

Article Identification

Potential articles for this review were initially identified by performing a series of MEDLINE searches by using 6 key word combinations: cancer and pain and reliability; cancer and pain and validity; cancer and pain and quality of life; cancer and pain and assessment and 1999; cancer and pain and assessment and 2000; and cancer and pain and assessment and 2001. (The years 1999 through 2001 were added to the latter 3 searches because without them the searches returned too many articles to be practical.) This was followed by a series of searches with key words that included specific measures that emerged in the initial searches (McGill Pain Questionnaire, MPQ, Visual Analogue Scale, Visual Analog Scale, VAS, Numerical Rating Scale, NRS, Verbal Rating Scale, VRS, FLIC, QLQ-C-30, QLQ-C33, QLQ-C36, SF-36; each combined with the key words cancer and pain). The titles and abstracts of the articles identified from these searches were read to select those that might contain psychometric data concerning cancer pain measures. These articles were then read, and those that met the inclusion criteria (listed below) were selected. The reference list of each selected article was also reviewed to identify additional potential articles that might meet the inclusion criteria.

The search resulted in the identification and review of 273 potential articles. From these, 164 articles were selected that met the following 4 inclusion criteria: (1) some or all of the study subjects had cancer, (2) at least 1 measure of pain was included, (3) the pain measure was described adequately enough to be able to classify the specific type of scaling used and the dimension(s) of pain it assessed, and (4) data were presented that address the psychometric properties of the pain measure(s) included in the study. A list of the articles included in this review can be obtained from the author via e-mail ([email protected]), or they can be seen at www.jpain.org.


Summarizing of Study Findings

The 164 articles were read, and the psychometric data were summarized into a table that included the following information: author(s) and date; pain measure(s) used/examined; study design; description of sample(s) and setting(s) (sample size, diagnosis/diagnoses of subjects, age of subjects, sex of subjects, setting[s] of the study, and sample selection procedures); findings concerning the validity of the measure(s); findings concerning the reliability of the measure(s) (most often test-retest stability and/or internal consistency); and findings concerning measure utility (eg, comments concerning specific difficulties with the measure[s] or the frequency of subject inability to use the measure[s]). The final column in the summary table listed conclusions that may be drawn from the study concerning the validity and/or reliability of the measure(s) studied, as well as any comments concerning the methodologic strengths or weaknesses of the study regarding the interpretation of the findings. A copy of the table that includes the coded data can be obtained from the author via e-mail ([email protected]), or it can be seen at www.jpain.org. Data describing the subjects (subject number, cancer diagnoses, sex, age, ethnicity), cancer pain dimensions and measures assessed, and type(s) of validity and reliability data presented in these studies were entered into a database for study descriptive analyses (see below).

Results

Description of the Study Samples

The total number of subjects with cancer reported in these studies was 36,128, and the average number of subjects per study was 220 (range, 6 to 1897). The types of cancer diagnoses held by the study subjects were not always specified; 11,401 or 32% of the subjects in these studies were reported to have cancer, but no specific cancer diagnosis or site was given. In those studies that did identify cancer diagnosis or site, the diagnoses most often carried by the study participants were breast (14%), lung (11%), prostate (10%), head and neck (7%), metastatic (6%), gynecologic (4%), myeloma (4%), gastrointestinal (2%), colon, rectal, or colorectal (2%), and genital (2%) cancer. Melanoma, oral, lymphoma, liver, bone, hematologic, brain, cervical, stomach, orofacial, esophageal, reproductive, soft tissue, bladder, nasopharyngeal, testicular, kidney, and thoracic cancer (all less than 1%) were represented less often among the study participants.

Both sexes were represented in this body of research. Of the 24,979 subjects whose sex was specified, 13,349 (53%) were male and 11,630 (47%) were female. Age ranges varied, with the youngest subject in each study ranging from 12 to 64 years (average youngest age, 29.80 years) and the oldest subject in each study ranging from 47 to 99 years (average oldest age, 81.56 years). The average mean age for the 122 studies that reported the mean age of their sample was 58.72 years.

Ethnicity of the samples was reported in only 30 of the studies. When reported, the majority were reported to be white or Caucasian (average percent, 73.83%; range, 0% to 100% across studies), but other ethnic groups were often included. In the 21 studies that classified subjects by ethnic group with more specificity than just white/Caucasian versus other ethnicity, on average, 10.73% were reported to be black/African American (range, 0% to 41%), 15.85% Asian (range, 0% to 100%), 3.89% Hispanic (range, 0% to 24%), and 1.04% other ethnicity (range, 0% to 8%).
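The rounded sample figures reported in this section can be re-derived from the raw counts given in the text. The short Python sketch below does only that arithmetic; the variable and function names are illustrative and not part of the review.

```python
# Counts as reported in the text of this section.
total_subjects = 36128
number_of_studies = 164
unspecified_diagnosis = 11401
sex_specified = 24979
male, female = 13349, 11630


def percent(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


average_per_study = round(total_subjects / number_of_studies)

print("average subjects per study:", average_per_study)
print("unspecified diagnosis (%):", percent(unspecified_diagnosis, total_subjects))
print("male (%):", percent(male, sex_specified))
print("female (%):", percent(female, sex_specified))
```

Each computed value matches the rounded figure in the text (220 subjects per study, 32%, 53%, and 47%), and the male and female counts sum to the 24,979 subjects whose sex was specified.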

Dimensions Examined

The psychometric properties of a number of different measures assessing a variety of pain-related dimensions were examined in these studies, including those that assessed pain intensity (examined in 74.4% of the studies), pain intensity and pain interference combined into a single composite score (examined in 29.3% of the studies), pain interference (14.0% of studies), pain relief (10.4% of studies), pain quality (10.4% of studies), affect/unpleasantness/bothersomeness (8.5% of studies), pain site (1.8% of studies), and temporal aspects of pain (1.2% of studies). The types of measures used to assess these dimensions also varied and included single rating scales (eg, Visual Analog Scales, Numerical Rating Scales, Verbal Rating Scales, Graphic Rating Scales, and Mechanical Visual Analog Scales) and composite measures (of pain intensity, pain interference, pain affect, pain quality, and multiple-dimension composite measures).

Results Concerning Validity and Reliability of Self-Report Measures

All 3 major types of validity (content, construct, and criterion) were reported in these articles. Construct (61.0% of studies) and criterion (50.6%) validity were addressed much more often than content validity (2.4%). In the articles reviewed, construct validity was usually evaluated by examining the associations between the pain measures and measures of the same pain dimension or other pain-related constructs. Less often, and for multiple-item scales, construct validity was evaluated through factor analyses to show whether the items on a scale loaded together onto a single factor. Criterion validity was most often determined by (1) a measure’s sensitivity to changes in pain with treatment or (2) a measure’s ability to predict important outcomes, such as mortality or disease progression. Reliability of measures was examined less often than validity; only 26.8% of studies provided reliability data. It was most often presented as either an internal consistency coefficient for multiple-item scales (18.9% of studies) or as a test-retest stability coefficient (9.8% of studies).

Validity and Reliability of Single-Item Ratings of Cancer Pain Intensity

Visual Analogue Scale of Cancer Pain Intensity

A Visual Analogue Scale of pain intensity (VAS-I) consists of a line, usually 100 mm long, with each end of the line labeled with descriptors representing the extremes of pain intensity (eg, no pain, extreme pain). Respondents place a mark on the line that represents their pain intensity level, and the distance measured from the “no pain” end to the mark is that person’s VAS pain score.

The VAS-I has consistently demonstrated sensitivity to changes in cancer pain associated with treatment or time (eg, Stambaugh and Sarajian, 1981; Anderson et al, 1991; Sandouk et al, 1991; Moore et al, 1994; Ingham et al, 1996; Tannock et al, 1996; Talmi et al, 1997; Holland et al, 1998; Manfredi et al, 2000; Mercandante et al, 2000; Zeppetella, 2000) and usually shows strong associations with other pain intensity ratings (Kremer et al, 1981; Walsh and Leber, 1983; Ahles et al, 1984; Littman et al, 1985; Wilkie et al, 1990; Gaston-Johansson et al, 1992; Grossman et al, 1992; Soh and Hui-Gek, 1992; Paice and Cohen, 1997; Sze et al, 1998; Ramer et al, 1999; Chang et al, 2000; Klepstad et al, 2000). In only 2 studies did the VAS-I correlate less than .70 with other pain intensity ratings (r = .29 to .56 with 0-10 Numerical Rating Scales reported by Chang et al, 2000; r = .67 with a Verbal Rating Scale reported by Fishman et al, 1987).

The VAS-I has also demonstrated criterion validity through its associations with performance status (Hollen, Gralla, Kris, and Cox, 1994; Chang et al, 2000), diagnosis (cancer vs non-cancer, Padilla et al, 1983), setting (inpatients vs outpatients, Chang et al, 2000), measures of psychological distress (Gaston-Johansson et al, 1992), and measures of global quality of life (Coates et al, 1983; but also see Hollen et al, 1994). VAS measures of pain intensity have been shown to be distinct from VAS measures of pain unpleasantness, supporting the discriminative validity of both (Price et al, 1987). Change in pain intensity, as measured by a change in the VAS-I score, has been associated with both change in tumor status (Coates et al, 1983) and survival (Coates et al, 1992; see also Coates et al, 1993).

Test-retest reliability of the VAS-I was examined in 4 studies, with time periods ranging from 5 minutes (r = .95; Grossman et al, 1992) to 1 week (r = .75; Chang et al, 2000). The average test-retest coefficient across these 4 studies (7 coefficients) was .80 (Padilla et al, 1983; Grossman et al, 1992; Hollen et al, 1993; Chang et al, 2000).

Despite support for the validity of the VAS-I demonstrated in the studies cited above, there is evidence that VASs may be more difficult than other pain ratings for patients to understand and to complete (see also comparison studies below). For example, Bruera et al (1991) found that 16% of 101 palliative care patients were unable to complete a VAS-I, even with nurse assistance, and that this number increased to 84% as disease progressed.

Numerical Rating Scale of Cancer Pain Intensity

A Numerical Rating Scale of pain intensity (NRS-I) consists of a range of numbers (usually 0 to 10, but sometimes 0 to 100 or other ranges). Respondents are told that the lowest number represents no pain and the highest number represents an extreme level of pain (eg, pain as intense as you can imagine). They are asked to write down, circle, or state the single number that best represents their level of pain intensity.

NRS-I scales are used less often in research than VAS-I scales. The findings of the research that has been performed support the validity and reliability of NRS scales and indicate that their psychometric properties are very similar to those of VAS measures. For example, NRS-I scales tend to show very strong associations with VAS-I scales (Kremer et al, 1981; Wilkie et al, 1990; Paice and Cohen, 1997; Sze et al, 1998) and Verbal Rating Scales of intensity (Paice and Cohen, 1997). Only 2 studies have shown associations less than .70; one was by Chang et al (2000), which showed only moderate associations between NRS-I scales and a VAS-I, and the second was a study that showed a correlation coefficient of .59 between an NRS-I and a Verbal Rating Scale of intensity (Kremer et al, 1981).

NRS-Is have also been shown to be sensitive to changes (increases) in pain associated with radiotherapy (Trotti et al, 1998) and physical therapy (Smith et al, 1998) and to decreases associated with pain treatment (Farrar et al, 1998; Grond et al, 1999; Holzheimer et al, 1999; Leksowski, 2000; Wilkie et al, 2000; Meuser et al, 2001). NRS-Is have demonstrated criterion-related validity through their significant and positive associations with analgesic medication use (Daut et al, 1983), perceived need to contact health care providers (Sandbloom et al, 2001), pain interference (Daut et al, 1983; Owen et al, 2000), dyspnea (Smith et al, 2001), and a number of additional specific symptoms such as nausea, dry mouth, dyspnea, lack of appetite, fatigue, and constipation (Chang et al, 1999), and negative associations with treatment satisfaction (Lin, 2000) and measures of global quality of life (Wang et al, 1999; Chang et al, 1999; Owen et al, 2000; Poulos et al, 2001; Sandbloom et al, 2001).
Further support for the validity of 0-10 NRSs comes from Portenoy, Payne, Coluzzi, et al (1999), who found that the responses to this scale showed an appropriate dose response to treatment with oral transmucosal fentanyl citrate. In another study, a 0-10 NRS completed on one occasion predicted subsequent decreases in functioning among 93 persons with various cancer diagnoses (Dodd et al, 2001).

De Wit et al (1999) showed that 86% of a sample of 156 patients with various cancer diagnoses were able to complete 2 months’ worth of daily diaries that included a 0-10 NRS. They found that patient ratings of average pain provided during interviews every 2 weeks showed strong associations with actual diary averages (r’s ranged from .80 to .91), which provides some support for the validity of retrospective ratings of average cancer pain. However, patient retrospective ratings tended to be higher, by about 0.5 on the 0 to 10 scale on average, than their actual average pain intensity (as calculated from the diaries), calling into question the accuracy of retrospective ratings of past pain made by using 0-10 NRSs.

Only one study examined the test-retest reliability of NRS-Is and found very good stability for NRS-I ratings of worst pain (r = .93) and average pain (r = .78) but not for current pain (r = .59) during about a 2-day period (Daut et al, 1983). The coefficients were much lower (.34, .24, and .22) when the time period was extended to about 91 days (Daut et al, 1983), although a high degree of stability in pain intensity ratings would not necessarily be expected during a 3-month period, because pain can change over time.

Farrar et al (2000) performed a study that provides important validity data concerning the meaning of change in pain as defined by a 0-10 NRS. They operationalized a meaningful change in pain as that level of change that is associated with a patient not requiring a rescue dose as part of a titration phase of a clinical trial. They found that an absolute change of 2 points (out of 10) and a percent change of 33% in the 0-10 NRS showed the optimal sensitivity and specificity for detecting a meaningful change in pain in a sample of 130 patients with various cancer diagnoses. Although it will be important to replicate these findings in additional samples, these data do support the utility of 0-10 NRSs in particular, because such guidelines are not yet available for other measures of pain intensity (see below for their findings concerning a 4-point VRS of pain relief).
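To make the two cutoffs reported by Farrar et al (2000) concrete, the following Python sketch applies them to a pair of 0-10 NRS ratings. It is an illustration under stated assumptions rather than the procedure used in that study: the function and class names are hypothetical, the cutoffs are treated as inclusive (a reduction of exactly 2 points or exactly 33% counts as meeting the cutoff), and the two cutoffs are reported separately rather than combined into a single rule.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PainChange:
    absolute: float            # baseline minus follow-up (positive = less pain)
    percent: Optional[float]   # percent reduction; None when baseline is 0
    meets_absolute_cutoff: bool
    meets_percent_cutoff: bool


def classify_change(baseline, follow_up, abs_cutoff=2.0, pct_cutoff=33.0):
    """Compare a 0-10 NRS change against the 2-point and 33% cutoffs.

    Assumption: cutoffs are inclusive. Percent change is undefined
    when the baseline rating is 0, so it is returned as None.
    """
    for rating in (baseline, follow_up):
        if not 0 <= rating <= 10:
            raise ValueError("NRS ratings must lie between 0 and 10")
    absolute = baseline - follow_up
    percent = 100.0 * absolute / baseline if baseline > 0 else None
    return PainChange(
        absolute=absolute,
        percent=percent,
        meets_absolute_cutoff=absolute >= abs_cutoff,
        meets_percent_cutoff=percent is not None and percent >= pct_cutoff,
    )


# Made-up ratings: a drop from 6 to 4 meets both cutoffs,
# whereas a drop from 8 to 6 meets only the absolute cutoff (25% reduction).
print(classify_change(6, 4))
print(classify_change(8, 6))
```

The second example shows why the two cutoffs are kept separate in this sketch: the same 2-point reduction is a smaller proportional change when the baseline rating is higher.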

Verbal Rating Scale of Cancer Pain Intensity

Verbal Rating Scales of pain intensity (VRS-I) consist of a list of descriptors or phrases (eg, none, some, moderate, severe) that represent varying degrees of pain intensity. Each word or phrase has a number associated with it (eg, none = 0, severe = 3). Respondents are asked to select the single word or phrase that best represents their pain level, and the respondent’s score is the number associated with the word chosen. In the cancer pain literature, the number of descriptors in VRS-Is ranges from 4 (eg, Bergman et al, 1994) to 8 (eg, Ingham et al, 1996).

Like VAS-Is and NRS-Is, VRS-Is demonstrate sensitivity to changes in pain with treatment (Stambaugh and Sarajian, 1981; Tannock et al, 1989; Bergman et al, 1992; Bergman et al, 1994; Murphy et al, 1994; Ellershaw et al, 1995; Ingham et al, 1996; Tannock et al, 1996; Hammerlid et al, 1997; Farrar et al, 1998; Rogers et al, 1998; Portenoy, Payne, Coluzzi, et al, 1999; Molenaar et al, 2001) and show strong associations with other measures of pain intensity (Walsh and Leber, 1983; Littman et al, 1985; Fishman et al, 1987; Paice and Cohen, 1997; Rogers et al, 1998; Klepstad et al, 2000; but see Kremer et al, 1981, who found a correlation of only .59 between a VRS and 0-10 NRS). VRS-I ratings have also been shown to be associated with survival (Stockler et al, 1999; Tannock et al, 1996), tumor size (Rogers et al, 1998), analgesic use (Rogers et al, 1998), tumor stage (Rogers et al, 1999), disease stage (Cliff and MacDonagh, 2000), and anxiety about pain (Cliff and MacDonagh, 2000). Furthermore, decreases in VRS-Is are associated with decreases in tumor size in response to chemotherapy (Bergman et al, 1992).

Only two studies examined the test-retest reliability of VRS-Is. One found the VRS-I to be quite stable over a matter of minutes (kappa = .71, Ellershaw et al, 1995), and a second found the VRS-I to demonstrate relatively low stability (r = .55, Sneeuw, Aaronson, Osoba, et al, 1997) during a 1-week period.
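The VRS-I scoring rule described above (each descriptor carries a number, and the score is the number of the chosen descriptor) can be sketched in a few lines of Python. The descriptor list is the 4-point example given in the text; the helper names `make_vrs` and `score_vrs` are hypothetical, and assigning consecutive integers starting at 0 is an assumption consistent with the none = 0, severe = 3 example.

```python
def make_vrs(descriptors):
    """Map ordered descriptors to consecutive scores starting at 0."""
    return {word.lower(): score for score, word in enumerate(descriptors)}


def score_vrs(scale, response):
    """Return the numeric score for the single descriptor chosen."""
    key = response.strip().lower()
    if key not in scale:
        raise ValueError(f"{response!r} is not a descriptor on this scale")
    return scale[key]


# The 4-point example from the text: none = 0 ... severe = 3.
four_point = make_vrs(["none", "some", "moderate", "severe"])
print(score_vrs(four_point, "moderate"))
```

Scales with more descriptors (up to 8 in the literature reviewed) would be built the same way by passing a longer ordered list.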

Other Single-Item Measures of Cancer Pain Intensity

Single-item measures other than VAS-Is, NRS-Is, and VRS-Is are used much less often to assess cancer pain intensity. Measures that have been used include Mechanical Visual Analogue Scales, Graphic Rating Scales, Faces Scales, a Finger Dynamometer, and various combination scales.

A Mechanical Visual Analogue Scale of pain intensity (M-VAS-I) is very similar to the VAS-I, except that instead of making a pencil or pen mark on a line on paper, the respondent moves a slider between the 2 extremes of pain on a plastic or cardboard scale. The scale administrator then looks on the back of the scale and reads the distance that the slider was moved directly from a ruler. M-VAS-Is are very strongly associated with VAS-Is (r = .99, Grossman et al, 1992; r = .77, Ramer et al, 1999) and other pain intensity ratings (Geddes et al, 1990; Ramer et al, 1999). They are also highly reliable over a 5-minute period (r = .95, Grossman et al, 1992). In short, they appear to share many of the properties of VAS-Is.

Graphic Rating Scales of pain intensity (GRS-I) are also similar to VAS-Is. The primary difference is that GRS-Is add specific markers along the VAS continuum with labels associated with each marker. For example, the GRS-I used by Greenwald et al (1987) consisted of a 100-mm line with the numbers 1 through 5 evenly spaced along the line and descriptors (no pain, slight pain, moderate pain, very bad pain, pain as bad as can be) below each number. Depending on the specific instructions, respondents to GRSs might circle the number or descriptor or make a mark on the line (by using the numbers or descriptors as guidelines) that best represents their pain intensity. Greenwald et al (1987) found that the GRS-I they used was associated with diagnosis (persons with pancreatic cancer reported greater pain than those with lung, prostate, and cervical cancer) and analgesic use. McMillan et al (1988) showed that a 0-10 GRS was sensitive to decreases in pain that occurred when a pain monitoring system in an inpatient cancer treatment center was established. No other study was identified that used GRS-Is to assess pain in persons with cancer.

Face Scales of pain intensity present the respondent with drawings of facial expressions representing increasing levels of pain intensity and suffering. Respondents select the single drawing that best represents their pain level, and their score is the number (rank order) of the expression chosen. Two studies have used Face Scales with patients with cancer. Ramer et al (1999) found the Face Scale to show strong associations with other ratings of pain intensity (eg, r = .82 between the Face Scale and a VAS-I), and Shannon et al (1995) found that about 81% of their sample with various cancer diagnoses were able to complete the Face Scale (compared with 75% who were able to complete a VAS-I and 89% a VRS-I). These preliminary studies suggest that Face Scales could potentially be valid as measures of pain intensity. However, Ramer et al (1999) did comment that some of the male patients in their study were uncomfortable with rating their pain at the highest level by using the Face Scale because the expression representing the most severe level of pain had tears on the face of the drawing. This raises the possibility that the Face Scale (or at least one that includes tears) may underestimate pain intensity in some patients with severe pain.

Wilkie et al (1990) examined the validity of a Finger Dynamometer measure of pain intensity, with which respondents indicate their pain level by “squeezing as hard as you hurt.” However, this measure was shown to be only moderately associated with VAS-I and NRS-I measures (which were very strongly associated with one another), a finding that does not support the Finger Dynamometer as a measure of pain intensity.

Finally, different components of pain intensity measures can be combined into single scales (eg, numbers combined with descriptors to make an NRS/VRS-I, Grossman et al, 1992; Campbell et al, 2000; Maunsell et al, 2000; or a diagram with descriptors, Sneeuw et al, 1999; Sneeuw, Aaronson, Sprangers, et al, 1997). The evidence from studies looking at NRS/VRS-Is suggests that they, too, are valid as measures of pain intensity, as shown by their strong associations with other measures of pain intensity (Grossman et al, 1992), association with analgesic use, pain interference, and measures of global quality of life (Maunsell et al, 2000), and association with treatment history and concern about cancer (Campbell et al, 2000). The two studies that used the diagram plus descriptor measure of pain did so in the context of examining the validity of proxy (clinician or caregiver) measures of patients’ pain, which is discussed in a separate section below.

Validity and Reliability of Multiple-Item Ratings of Cancer Pain Intensity

By far the most frequently used multiple-item measure of pain intensity in cancer research is the Pain Intensity Scale of the Brief Pain Inventory (BPI).3 The BPI Pain Intensity Scale score is created by averaging four 0-10 NRS-Is (of current pain and worst, least, and average pain during a specified period, usually the past week) into a single pain intensity score. Six other composite pain intensity measures examined in this literature include (1) a composite score made up of three 0-10 NRSs of average, worst, and least pain (Cleary et al, 1995); (2) a 4-item Pain Scale from the Rotterdam Symptom Checklist (de Haes et al, 1990); (3) a 3-item Pain Scale from the EORTC QLQ-C36 (a precursor to the EORTC QLQ-C30, Sigurdartóttir et al, 1993); (4) a 4-item measure of pain in the mouth, jaw, and throat area for patients with head and neck cancer (Bjordal et al, 1994); (5) another 4-item measure of pain to be used with patients with head and neck cancer (Terrell et al, 1997); and (6) a 3-item measure of pain associated with breast cancer (Stanton et al, 2001).

BPI Pain Intensity Scale

Four questions on the BPI ask respondents to rate their current pain and past worst, least, and average pain on 0-10 NRS-Is, with “no pain” and “pain as bad as you can imagine” as the descriptive end points. Most of the studies that have examined this measure in populations of persons with cancer have used factor analyses to demonstrate that the BPI Intensity items load together onto a single factor (Cleeland et al, 1988; Cleeland and Ryan, 1994; Caraceni et al, 1996; Wang et al, 1996; Uki et al, 1998; Ger et al, 1999; Radbruch et al, 1999; but see Saxena et al, 1999, for a finding that the worst pain item loaded on both an intensity and an interference factor in 1 of 2 samples). These studies have also shown that the scale is highly internally consistent, with coefficient alphas ranging from .78 to .97 in a variety of samples of persons with cancer from many different countries (Serlin et al, 1995; Caraceni et al, 1996; Wang et al, 1996; Uki et al, 1998; Radbruch et al, 1999; Saxena et al, 1999; Mystakidou et al, 2001; Sandbloom et al, 2001). Research comparing the psychometric properties of the BPI Pain Intensity Scale across cultures indicates that the BPI Pain Intensity items are relatively free of cultural and linguistic bias (Serlin et al, 1995). The BPI Pain Intensity Scale has also been shown to be associated with other measures of pain intensity (McMillan et al, 2000), performance status (Wang et al, 1996; Caraceni et al, 1999), pain interference (Wang et al, 1996; Ger et al, 1999; Radbruch et al, 1999), source of assessment (with physicians reporting lower pain levels on the scale than patients did, Larue et al, 1995), and nationality (Vietnamese patients reported higher pain levels than patients from the United States, which was expected given that they were receiving fewer analgesics, Cleeland et al, 1988). However, no study was identified that examined the sensitivity of the BPI Pain Intensity Scale to the effects of cancer pain treatment.
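The scoring rule and the internal consistency statistic discussed above can be illustrated with a short sketch. It assumes that the scale score is the plain mean of the four 0-10 ratings, as described in the text, and that coefficient alpha is computed with the usual Cronbach formula; the function names and the example ratings are hypothetical and are not taken from any of the studies reviewed.

```python
from statistics import mean, variance


def bpi_intensity_score(current, worst, least, average):
    """Mean of the four 0-10 intensity ratings for one respondent."""
    ratings = [current, worst, least, average]
    if any(not 0 <= r <= 10 for r in ratings):
        raise ValueError("each rating must lie on the 0-10 scale")
    return mean(ratings)


def cronbach_alpha(rows):
    """Coefficient alpha for a list of respondents.

    Each row holds one respondent's item ratings, in the same item order.
    """
    k = len(rows[0])
    # Variance of each item across respondents
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Variance of the summed score across respondents
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings (current, worst, least, average) for five respondents
sample = [
    [3, 6, 1, 4],
    [5, 8, 3, 6],
    [2, 4, 0, 2],
    [7, 9, 5, 7],
    [4, 7, 2, 5],
]
scores = [bpi_intensity_score(*row) for row in sample]
alpha = cronbach_alpha(sample)
```

Because alpha depends on the ratio of summed item variances to total-score variance, using sample or population variances gives the same result as long as the choice is consistent.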

Other Multiple-Item Measures of Cancer Pain Intensity

Cleary et al (1995) examined the psychometric properties of the Health-Related Quality of Life scale in a number of samples of men with advanced prostate cancer (total, 487 subjects). This measure includes three 0-10 NRS ratings of average, worst, and least pain intensity. The internal consistency coefficients (coefficient alphas) of these 3 items ranged from .89 to .92 across samples, and the composite pain intensity score made up of these items was associated significantly with measures of a variety of quality of life dimensions such as emotional well-being (r = −.40), social functioning (r = −.48), and physical capacity (r = −.58), among others (Cleary et al, 1995). The Rotterdam Symptom Checklist asks respondents to rate the intensity of sore muscles, low back pain, headache, and abdominal aches, among 30 other symptoms, by using 0-3 Verbal Rating Scales. Factor analyses of the responses to this scale showed a clear pain factor emerging in only 1 of 3 samples of patients with cancer (de Haes et al, 1990). Although internal consistency (coefficient alpha, .81) of a scale made up of these 4 items was good in this 1 sample, the variability of the factor analysis findings suggests that multiple-item scales made up of items representing pain in different sites may have limited reliability. Sigurdardóttir et al (1993) examined the psychometric properties of a 3-item cancer pain intensity measure made up of a 1-4 NRS-I from the 36-item European Organization for Research and Treatment of Cancer Quality of Life Questionnaire for Cancer (EORTC QLQ-C36; a precursor to the EORTC QLQ-C30, which has since been used in many subsequent studies on quality of life among persons with cancer), and two 1-4 NRS-I items from a Malignant Melanoma assessment module that assess pain intensity with movement and with rest. The scale made up of these 3 items showed a good internal consistency (alpha, .81), was moderately associated with other measures of quality of life, and was associated significantly with types of metastases (superficial vs visceral) (Sigurdardóttir et al, 1993). Bjordal et al (1994) describe the development of the Head and Neck cancer questionnaire, which includes 4 items that assess pain (jaw pain, mouth pain, mouth soreness, throat pain). Although reliability and validity statistics were not presented in the original scale development study, the authors did detail the scale development procedures and specifically interviewed head and neck cancer specialists to identify issues relevant to patients with head and neck cancer. These procedures support the content validity of the measure. Hammerlid and Taft (2000) subsequently showed that a scale made up of these 4 items predicted survival in a sample of 135 patients with head and neck cancer. The scale has also been shown to have appropriate associations with measures of different aspects of quality of life, adequate internal consistency (coefficient alpha, .72), and good test-retest stability (r = .72) during a 1-month period (Zotti et al, 2001). Another multiple-item measure of pain associated with head and neck cancer was described by Terrell et al (1997).
The Head and Neck Quality of Life Instrument (HN QOL) they developed includes two 5-point VRSs of the bothersomeness of mouth/jaw/throat pain and shoulder/neck pain, a 5-point scale of frequency of pain medication use, and a 5-point VRS of physical problems, all combined into a single pain scale composite score. The internal consistency (coefficient alpha) of this scale is adequate (alpha, .79), and its test-retest stability during a 5- to 7-day period is high (r = .81). This measure was also shown to be associated with physical and psychologic functioning and with the 2-item SF-36 Bodily Pain scale (r = .66), further supporting the validity of the measure. A subsequent study provided further support for the HN QOL Pain Scale by showing positive associations between this scale and a rating of overall bothersomeness of pain (r = .63) and a univariate association with employment status (Terrell et al, 1999). Stanton et al (2001) described a 3-item measure of breast pain, shoulder stiffness, and breast sensitivity and showed that it was internally consistent (Cronbach alpha, .81) and associated significantly with quality of life, depressive symptoms, and fear of disease. In a subsequent study, this measure was not shown to be associated significantly with cosmetic outcomes or arm edema (Krishnan et al, 2001).
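Several of the studies in this section report test-retest stability as a correlation (r) between scores from two administrations of the same scale. The sketch below assumes that these coefficients are ordinary Pearson correlations, which the reviewed articles do not always state explicitly; the function name and the scores are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations around the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical pain scale scores for six patients at two assessments
time_1 = [2.0, 5.5, 3.0, 7.5, 4.0, 6.0]
time_2 = [2.5, 5.0, 3.5, 7.0, 3.0, 6.5]
stability = pearson_r(time_1, time_2)
```

A high coefficient indicates only that patients keep their rank order across occasions; it does not show that mean pain levels were unchanged.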

Validity and Reliability of Measures of Cancer Pain Interference

Pain interference refers to the extent to which pain interferes with day-to-day functioning. The most common measure of cancer pain interference is the Brief Pain Inventory Pain Interference Scale.³ This scale consists of 7 items that ask respondents to indicate the extent to which pain interferes with general activity, mood, walking ability, normal work, relations with other people, sleep, and enjoyment of life on 0-10 NRSs, with 0 = Does not interfere and 10 = Completely interferes. The responses to the 7 items are averaged to form the Pain Interference Scale score. Factor analyses of responses show that the 7 interference items load together onto a single factor (Cleeland et al, 1988; Cleeland and Ryan, 1994; Caraceni et al, 1996; Wang et al, 1996; Uki et al, 1998; Ger et al, 1999; Radbruch et al, 1999; Saxena et al, 1999; Mystakidou et al, 2001) and that the scale has excellent internal consistency (with alphas ranging from .78 to .91; Serlin et al, 1995; Caraceni et al, 1996; Wang et al, 1996; Radbruch et al, 1999; Saxena et al, 1999; Mystakidou et al, 2001). One study used multidimensional scaling to determine the factors underlying the BPI Pain Interference items in a large sample of 1843 persons with metastatic cancer (Cleeland et al, 1996). These analyses yielded 2 underlying interference dimensions: interference with activity (walking, work, general activity, sleep) and affectivity-related interference (relations, mood, enjoyment of life), suggesting the possibility of alternate scoring and use of the BPI Pain Interference Scale. However, this alternate scoring has yet to be used or tested in additional samples of persons with cancer. The BPI Pain Interference Scale is associated, as would be expected, with measures of pain intensity (Daut et al, 1983; Ger et al, 1999; McMillan et al, 2000).
However, there has yet to be an examination of the ability of this scale to detect changes in pain interference associated with cancer pain treatment. Another multiple-item pain interference measure was described by Ripamonti et al (2000). They performed a factor analysis on 2 VRSs of pain intensity (at rest and with movement), VRSs of pain’s effects on sleep, depression, nervousness, and concentration, and a number of additional ratings such as the perceived cause of pain, use of analgesic medications, and relief provided by analgesic medications. The 4 VRSs of pain impact/interference loaded together on a single factor, whereas the 2 intensity ratings did not. The 4 pain interference items had an adequate internal consistency (alpha coefficient, .73), and a scale score made up of an average of these items was significantly associated with pain relief. Maunsell et al (2000) asked 98 patients with various cancer diagnoses to complete a pain diary on a daily basis that also included a 5-item measure of pain interference. The internal consistency of this scale was high (range, .87 to .92; average, .91 across the 4 weeks of data collection), and the measure was associated with pain intensity ratings, number of analgesic rescue doses received, and a variety of dimensions of quality of life. Finally, a number of single-item ratings of pain interference have been examined in this literature. A quality of life measure used in some cancer research, the Functional Living Index, Cancer Scale (FLIC), includes a 1-7 GRS asking about the extent to which pain disrupts activity. This item loaded on a physical well-being factor in a factor analysis of the FLIC, providing further support for the association between pain interference and overall quality of life (King et al, 1996). Daut and Cleeland (1982) examined 2 of the NRS interference items from the BPI (activity and enjoyment of life) and showed that the association between pain intensity and pain interference, as measured by these items, is nonlinear. A change in worst pain intensity from a 4 (out of 10) to a 5 is associated with a larger increase in interference than any other change in pain intensity. Ferrans and Ferrell (1990) found that a 1 (not at all) to 7 (a great deal) NRS of pain interference was associated with a measure of global quality of life, and another 7-point NRS of pain interference tended to load more strongly with measures of physical disability than with measures of pain intensity, supporting an important linkage between pain interference and disability, as well as a distinction between pain interference and pain intensity (King et al, 1996). Klee et al (1997) administered a cancer-specific quality of life measure that includes a 4-point VRS of pain interference to 1041 women with breast or gynecologic cancer. They found that women with gynecologic cancer reported greater pain interference with this measure than women with breast cancer did.
In addition, because a 4-point VRS of pain intensity did not discriminate these 2 diagnostic groups, the findings provide further support that pain intensity and pain interference are distinct dimensions of pain.
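The BPI Pain Interference scoring described earlier in this section, together with the two-dimension alternative suggested by the multidimensional scaling results, can be sketched as follows. The total score follows the averaging rule stated in the text. Scoring the activity and affective dimensions as subscale means is an assumption made here for illustration only, since the text notes that this alternate scoring has not yet been tested; the item keys and the example ratings are hypothetical.

```python
from statistics import mean

# Item groupings follow the two dimensions reported by Cleeland et al (1996);
# the dictionary keys themselves are hypothetical labels.
ACTIVITY_ITEMS = ("general_activity", "walking", "work", "sleep")
AFFECTIVE_ITEMS = ("mood", "relations", "enjoyment")


def interference_scores(ratings):
    """Score one respondent's seven 0-10 interference ratings.

    ratings maps each item label to a rating from 0 (does not interfere)
    to 10 (completely interferes).
    """
    items = ACTIVITY_ITEMS + AFFECTIVE_ITEMS
    missing = [name for name in items if name not in ratings]
    if missing:
        raise KeyError(f"missing items: {missing}")
    return {
        # Standard scale score: mean of all 7 items
        "total": mean(ratings[name] for name in items),
        # Assumed alternate scoring: mean within each dimension
        "activity": mean(ratings[name] for name in ACTIVITY_ITEMS),
        "affective": mean(ratings[name] for name in AFFECTIVE_ITEMS),
    }


example = interference_scores({
    "general_activity": 6, "walking": 4, "work": 8, "sleep": 2,
    "mood": 3, "relations": 3, "enjoyment": 3,
})
```

In this hypothetical respondent the two subscale means differ (5 vs 3) even though the total score is a single number, which is the kind of information a single averaged score cannot show.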

Validity and Reliability of Measures of Cancer Pain Relief

Whereas pain intensity ratings ask patients to rate the intensity of felt pain, pain relief ratings ask patients to rate how much relief from pain they have experienced, usually in reference to a specific treatment or intervention. Relief ratings have been shown to be sensitive to the effects of treatment (VAS relief ratings: Wallenstein, 1991; Shannon et al, 1995; Manfredi et al, 2000; VRS relief ratings: Stambaugh and Saragian, 1981; Littman et al, 1985; Wallenstein, 1991). Also, in one study, relief ratings were strongly and negatively associated with pain intensity ratings (VAS relief rating, Fishman et al, 1987). However, in two other studies, the associations between pain relief and pain intensity measures were weak (VAS rating, Ramer et al, 1999; NRS rating, Daut and Cleeland, 1982). Supporting the validity of relief ratings as measures of change in pain intensity, some studies have shown positive associations between pain intensity change scores and relief ratings (VAS, Angst et al, 1999; NRS, De Conno et al, 1994). Interestingly, however, the association between pain relief and change in pain intensity is not always strong, so ratings of these 2 constructs (change in pain, pain relief) appear to measure related but also distinct constructs. For example, Angst et al (1999) found that when pain intensity and pain relief were assessed 10, 20, and 30 minutes after an infusion (pain intensity was also assessed preinfusion), pain relief ratings tended to increase as pain intensity decreased. However, for many patients, pain relief ratings remained above 0 (indicating at least some relief) even when pain intensity returned to preinfusion levels. Similarly, de Wit et al (1999) demonstrated the distinction of a VRS rating of pain relief from pain intensity by performing a factor analysis of pain intensity ratings, a VRS rating of pain relief, and other measures. They found that the pain relief rating loaded with measures of treatment satisfaction and perceived adequacy of analgesia, but not with the pain intensity ratings. The findings of Farrar et al (2000) concerning the meaningfulness of change in pain as measured by a change in a 0-10 NRS were described above. These investigators also identified the specific rating of relief (with a 5-point VRS-R scale: none, slight, moderate, lots, complete) best associated with a meaningful change in pain. They found that the rating of moderate relief best represented meaningful change to the participants with cancer pain in their study, supporting this rating as a reasonable treatment outcome goal.

Validity and Reliability of Measures of Cancer Pain Site

There are 2 common methods to assess pain site in pain research, a pain drawing and a pain site checklist. A pain drawing consists of an outline drawing of a human body, and respondents are asked to indicate on the drawing, usually by shading appropriate areas, the specific sites of pain or other sensations. Pain drawings can be coded by placing a transparency template over the drawing and coding whether the respondent shaded in specific areas.¹⁸ Site checklists simply ask respondents to indicate in which of a list of possible sites (eg, head, neck, back, arms) they experience pain. To date, only pain drawings have been used in cancer pain research. Zimmerman et al (1987) reported a “close correspondence” between patient-completed pain drawings and the anatomic locations of disease in a sample of 40 persons with various cancer diagnoses, although no statistical test of this correspondence was reported. Wilkie et al (1992) found that the number of pain sites was associated with pain intensity in a sample of 45 patients with lung cancer. However, other than a single additional study that examined the association between patient and nurse reports of pain site (discussed in the section on proxy measures of pain), no other study was identified that presented data concerning the psychometric properties of pain drawings in persons with cancer.

Validity and Reliability of Measures of the Temporal Aspects of Cancer Pain

The temporal aspects of pain, such as its variability, frequency, and duration, as well as its pattern across time (over minutes, hours, days, or months) can be assessed by asking patients to rate their pain on multiple occasions over time. For example, Maunsell et al (2000) asked 98 patients with various cancer diagnoses to complete daily pain diaries with which they rated pain intensity and pain interference. They found that 86% of the subjects completed 1 week of diaries and that, of these, 90%, 84%, and 74% completed diaries in weeks 2, 3, and 4, respectively. Thus, they demonstrated in 1 sample of patients that the majority of patients were willing to complete daily pain diaries (see also de Wit et al, 1999). Data from such diaries can be coded to score many of the temporal aspects of pain listed above. Variability can be operationalized as the standard deviation of daily pain intensity ratings, frequency as the number of times pain intensity is above specific thresholds (eg, number of times pain intensity is greater than 0 or even greater than some level indicating moderate or severe pain²⁴), and average duration as the average amount of time patients experience pain levels above specific cutoffs. Specific time patterns of pain within or across days can also be coded from these data (eg, no change over time, increasing or decreasing over time¹⁴). However, not every clinician or researcher has the resources to be able to administer and code diary data, and not all patients will comply with requests for daily diary data. A more efficient way to assess pain frequency, duration, and variability, provided that research demonstrates patients are able to provide accurate responses, is to ask patients to recall and rate these temporal aspects of their pain experience. One temporal pain dimension assessed in the articles reviewed for this article was the frequency of pain.
Kaasa et al (1995) used a 5-point VRS to measure the frequency of pain that ranged from “All day” to “Not at all.” They found that responses to this measure were strongly associated with a composite measure of pain intensity and pain interference as measured by the QLQ-C30 Pain Scale. Rathmell et al (1991) asked patients with head or neck cancer to rate the frequency of their pain on a 4-point VRS with 1 = Never and 4 = Daily. Pain frequency, but not pain intensity (also measured by a 4-point VRS-I), was associated with type of treatment received, with patients who received both radiation and surgery reporting greater pain frequency than those who received radiation alone. Samarel et al (1996) showed that a combination 5-point NRS/VRS of pain frequency (1 = Never to 5 = Always) loaded with measures of pain intensity and pain upsetness into a single scale. This scale was subsequently found to be significantly associated with other symptoms, such as fatigue, and with treatments received (chemotherapy vs no treatments). These preliminary findings indicate that pain frequency is related to, but might also be distinct from, pain intensity. Portenoy, Payne, and Jacobsen (1999) defined breakthrough pain as an episode of severe or excruciating pain that occurs in the context of an ongoing (more than half of the time during waking) background pain. They found that the presence of breakthrough pain was associated with other important pain-related variables such as average intensity of background pain, pain interference, and measures of both depression and anxiety. Other than these 4 studies and despite the preliminary indications that pain frequency or variability may be important determinants of pain’s impact on function and quality of life, no other studies were identified that examined the validity or reliability of measures of the temporal aspects of cancer pain.
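The diary-based operationalizations described in this section (variability as the standard deviation of daily ratings, frequency as the number of ratings above a threshold, and average duration as the average time spent above a cutoff) can be sketched as follows. The sketch assumes one rating per day, so that duration is counted in consecutive days above the cutoff; that unit, the function name, the default threshold, and the example diary are all hypothetical choices made for illustration.

```python
from statistics import mean, stdev


def diary_summary(daily_ratings, threshold=0):
    """Summarize a sequence of daily 0-10 pain intensity ratings.

    threshold is the cutoff a rating must exceed to count as a pain day.
    """
    above = [rating > threshold for rating in daily_ratings]

    # Lengths of consecutive runs of days above the threshold
    runs, current = [], 0
    for flag in above:
        if flag:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)

    return {
        "variability": stdev(daily_ratings),   # needs at least 2 ratings
        "frequency": sum(above),               # number of days above threshold
        "average_duration_days": mean(runs) if runs else 0,
    }


# Hypothetical one-week diary, scored against a cutoff of 4
week = [0, 5, 6, 0, 0, 7, 0]
summary = diary_summary(week, threshold=4)
```

With diaries completed several times per day, the same logic could be applied to individual entries rather than days, at the cost of having to decide how to treat missed entries.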

Validity and Reliability of Measures of the Qualitative and Affective Components of Cancer Pain

Pain has many sensory and affective qualities in addition to its intensity component. The most common measure of these aspects of pain in persons with cancer is the McGill Pain Questionnaire, but the short-form McGill Pain Questionnaire and single-item ratings have also been used.

McGill Pain Questionnaire (MPQ)

The MPQ consists of 78 pain descriptors classified into 20 categories of pain that can be scored to assess 4 major dimensions of pain: sensory, affective, evaluative, and miscellaneous pain, as well as a total pain severity score.²⁰ Data support the conclusion that the MPQ qualitative scale scores assess something other than pain intensity. For example, Chung et al (2001) found very low associations between a pain intensity rating and both the MPQ-NWC (r = −.09) and MPQ-PRI (r = .00). Other investigators have found stronger associations between MPQ scales and pain intensity ratings (Ahles et al, 1984, r’s range from .49 to .57; Graham et al, 1980, r’s up to .40, lowest r value not specified; Wilkie et al, 1992, r’s up to .58, lowest r value not specified; see also Kremer et al, 1982). Although these associations are usually positive, indicating that the MPQ scales and pain intensity assess related dimensions, they are not strong enough to support the conclusion that MPQ scales and pain intensity rating scales assess the same thing. Further evidence for a distinction between the MPQ scale scores and pain intensity ratings was found by De Conno et al (1994). They performed 2 factor analyses using a VAS-I, an NRS-I, a VRS-I, the MPQ-PRI score, and a composite measure of the frequency of 5 different qualities of pain obtained at 2 different points in time in 53 patients with various cancer diagnoses. A single factor emerged from each factor analysis, with the 3 pain intensity measures loading most strongly on this factor (factor loadings ranged from .79 to .92) and the MPQ-PRI showing a weak loading in one analysis (.39) and a stronger loading in the second (.72). Similarly, a factor analysis of change scores in these measures from one time point to the next, plus a 5-point rating of pain relief, resulted in a single factor, with the pain intensity change scores showing stronger loadings (range, .80 to .83) and the MPQ-PRI score showing a weaker loading (.47) on this factor (De Conno et al, 1994). The MPQ scales have been found to be positively associated with analgesic medication use (Ahles et al, 1983), illness conviction (Dalton et al, 1988), and quality of life (Schipper et al, 1984), and the MPQ-Total score has been shown to be sensitive to the effects of prednisone on prostate cancer pain (Tannock et al, 1989), supporting the validity of the MPQ scales as measures of pain. Support for the validity of the MPQ-Affective scale to assess the affective component of pain specifically was reported by Ahles et al (1983), who found that this scale was more strongly associated with measures of psychologic distress than with measures of pain intensity. Also, Kremer et al (1982) reported that cancer patients with low pain intensity report a greater affective component of their pain on the MPQ-Affective scale than patients with low back pain do, consistent with the hypothesis that cancer pain may have greater affective associations (eg, be more worrisome and cause more fear) than low back pain. Only two studies have examined the reliability of the MPQ in patients with cancer. Both studies found that patients with cancer are generally consistent in the MPQ words they use to describe their pain from 1 week to the next (Graham et al, 1980; Walsh and Leber, 1983). Concerning utility, one study found the MPQ to be difficult for most persons with terminal cancer receiving palliative care to use (Talmi et al, 1997). However, a second study found that 84% of a sample of patients with cancer were able to complete the MPQ (Shannon et al, 1995).

Short-Form McGill Pain Questionnaire (SF-MPQ)

The SF-MPQ consists of a subset of 15 descriptors from the MPQ drawn from the sensory and affective categories.²¹ Responses to the 15 SF-MPQ items can be scored to form a total SF-MPQ score as well as both Sensory and Affect SF-MPQ subscale scores.²¹ Only one study has examined the psychometric properties of the SF-MPQ in persons with cancer. Hollen, Gralla, Kris, Cox, Belani, et al (1994) administered the SF-MPQ to 207 patients with lung cancer and found that the 15 SF-MPQ items were highly internally consistent (coefficient alpha, .91) and that the 2 SF-MPQ subscales were strongly associated with one another. This latter finding suggests the possibility that the 2 SF-MPQ scale scores may tap into a similar underlying construct.

Other Measures of Pain Affect

Ahles et al (1984) asked 37 patients with various cancer diagnoses to rate the affective component of pain 4 times/day for 7 days by using a VRS of affective pain descriptors (VRS descriptors compiled by Tursky et al³⁰). The study participants were also administered 2 measures of pain intensity (a VAS-I and an NRS-I). Although the 2 intensity measures were strongly associated with one another, the VAS-I showed only a weak to moderate association (r = .30) with the VRS-Affective rating, suggesting support for the conclusion that these measures are tapping into different components of pain. On the other hand, a different VRS of pain affect showed a very strong association with a mechanical VAS among 17 patients with Hodgkin’s lymphoma (r = .91; Gaston-Johansson et al, 1992). In this same study, the pain affect VRS showed associations with a number of criterion measures that were very similar to those shown by both a mechanical VAS and a VRS of pain intensity (Gaston-Johansson et al, 1992). The low number of subjects in this study may have been at least partially responsible for the strong association found between the VRS of pain affect and the measures of pain intensity used. Spiegel et al (1983) administered a 0-10 NRS of pain intensity and a 0-10 NRS of pain suffering to 86 women with breast cancer. They found that the 2 NRS scales were very strongly associated with one another (r = .81). They also found that the NRS of pain affect was significantly associated with measures of maladaptive coping, emotional distress, and use of analgesics. Smith et al (1998) also administered 0-10 scales of pain intensity and pain affect (0 = Not unpleasant at all, 10 = As unpleasant as you can imagine) to 32 patients with various cancer diagnoses and found that physical therapy increased the intensity rating but not the unpleasantness rating of pain. Such a finding supports the distinction between pain intensity and pain unpleasantness, even though measures of these 2 dimensions of pain may be strongly associated with one another.¹² Finally, Price et al (1987) examined the ability of a VAS of pain intensity and pain affect to distinguish between different diagnostic groups. They found that a sample of patients with cancer (and patients with low back pain and causalgia) showed a significantly larger difference between the intensity and unpleasantness ratings than patients with upper back pain, myofascial pain, labor pain, or orofacial pain did.
This further supports the distinction between the affective and intensity components of pain and the ability of the VAS to assess each pain component separately.

Validity and Reliability of Composite Scores of Pain Intensity and Pain Interference

Although pain intensity and pain interference are distinct dimensions of cancer pain, 2 commonly used measures of quality of life that have been used in samples of patients with cancer, the EORTC QLQ-C30 and the SF-36, combine individual ratings of each of these dimensions into single composite pain scales.

European Organization for Research and Treatment of Cancer Quality of Life Questionnaire for Cancer (EORTC QLQ-C30)

The EORTC QLQ-C30 is a 30-item measure developed to assess multiple dimensions of quality of life in persons with cancer.¹ In addition to assessing 6 domains of quality of life, the QLQ-C30 assesses 7 symptom domains, including pain. The Pain symptom scale contains two 4-point VRSs that assess pain intensity and pain interference, which are averaged and then transformed into a 0 to 100 scale of pain severity. The QLQ-C30 Pain Scale usually demonstrates adequate to excellent internal consistency (alpha coefficients range from .70 to .89) across a great variety of patients with cancer (Aaronson et al, 1993; Bergman et al, 1994; Bjordal and Kaasa, 1992; Fosså, 1994; Osoba et al, 1994; Ringdal and Ringdal, 1993; Wisløff et al, 1996; Sneeuw et al, 1998; Fosså, 2000), although one study in 120 patients with various cancer diagnoses showed marginal internal consistencies at 2 time points (alphas, .57 and .56; Kyriaki et al, 2001). Also, among a sample of 103 patients with brain cancer, the internal consistency of the QLQ-C30 Pain Scale appears to be reduced (alpha coefficient, .59), as is test-retest stability over a 1-week period (r = .44; Sneeuw, Aaronson, Osoba, et al, 1997). However, Hjermstad et al (1995) found the QLQ-C30 Pain Scale to have excellent test-retest stability (r = .86) over a 4-day period among 190 patients with various cancer diagnoses.
The QLQ-C30 Pain Scale has demonstrated appropriate associations with various criterion measures, including other dimensions of quality of life (Aaronson et al, 1993; Bjordal and Kaasa, 1992; Bjordal et al, 1999; Fosså, 1994; Kaasa et al, 1995; King et al, 1996; Kramer et al, 2000; Osoba et al, 1994; Ringdal and Ringdal, 1993; Rogers et al, 2000; Stockler et al, 1999), dispositional optimism (Allison et al, 2000), functional status (Aaronson et al, 1993; Stockler et al, 1999; Jordhøy et al, 2001), disease stage (Wisløff et al, 1996; Bjordal et al, 1999), clinical severity of cancer (King et al, 1996), survival (Kramer et al, 2000; Stockler et al, 1999; Camilleri-Brennan and Steele, 2001; Jordhøy et al, 2001), prognosis (Ringdal et al, 1994), presence of metastases (Osoba et al, 1994), performance status (Osoba et al, 1994), change in disease status (Wisløff et al, 1996), tumor size (Rogers, Lowe, et al, 1998), treatment received (patients who received esophagectomy reported less pain than those who received palliative care, Blazeby et al, 1995), and tumor response (Geels et al, 2000). This measure predicts cancer diagnosis (patients with oral cancer report greater levels of pain than patients with other cancer diagnoses, Bjordal et al, 1999) and has demonstrated sensitivity to change in pain over time and with treatment (Bjordal et al, 1999; Hammerlid et al, 1997; Rogers, Lowe, et al, 1998; Ilson et al, 1999; Roszkovski et al, 2000; Camilleri-Brennan and Steele, 2001; Fosså et al, 2000; Fosså et al, 2001; Kyriaki et al, 2001). The QLQ-C30 has also shown appropriate associations with patient ratings of global change in pain (Sneeuw et al, 1998). The QLQ-C30 Pain Scale has shown sensitivity to differences in outcome with treatment (Tannock et al, 1996; Moore et al, 1994; Klepstad et al, 2000; Langendijk et al, 2001) and to increases in pain caused by interferon (Wisløff and Gulbrandsen, 2000).
As would be expected, the QLQ-C30 Pain Scale is associated with measures of pain intensity (mostly in the .40 to .61 range, Bjordal et al, 1999; Geddes et al, 1990; Stockler et al, 1999; Klepstad et al, 2000; but see Kaasa et al, 1995, for a study describing stronger associations of rs = .71 to .85). These associations are generally weaker than those typically seen between different measures of pain intensity, supporting the conclusion that the QLQ-C30 Pain Scale is not necessarily a valid measure of pain intensity. On the other hand, one study found the association between the QLQ-C30 Pain Scale and the SF-36 Bodily Pain Scale (which is also a composite measure of pain intensity and pain interference) to be very strong (r = .83; Rogers, Lowe, et al, 1998), supporting the validity of both scales as composite measures of pain intensity and pain interference.

SF-36 Bodily Pain Scale

The second composite measure of pain intensity and pain interference that has been used with patients with cancer (although less so than the QLQ-C30) is the Bodily Pain Scale of the SF-36.³¹ The SF-36 is a measure of various quality of life dimensions and contains a pain intensity item and a pain interference item that are combined into a single composite Bodily Pain score. In support of the criterion validity of the SF-36 Bodily Pain scale, Broeckel et al (2000) showed that cancer patients who had been successfully treated with chemotherapy (ie, they did not have current evidence of disease) reported higher levels of pain on the SF-36 Bodily Pain scale than a noncancer control sample did. As with the QLQ-C30 Pain Scale, the SF-36 Bodily Pain scale has shown variable but rarely strong associations with measures of pain intensity, suggesting that the SF-36 Bodily Pain scale is not a highly valid measure of pain intensity alone (r = .59, Radbruch et al, 1999; r = −.16, Gaston-Johansson et al, 1999). The SF-36 has shown significant associations with measures of global quality of life (Rogers, Humphris, et al, 1998; Rogers et al, 2000), tumor stage (Rogers, Humphris, et al, 1998), survival (Camilleri-Brennan and Steele, 2001), and tumor size (Rogers, Lowe, et al, 1998). In one study examining the sensitivity of the SF-36 Bodily Pain Scale for detecting the differences between fluoxetine versus desipramine in the treatment of women with various cancer diagnoses, the investigators found that a VAS-I, but not the SF-36 Bodily Pain Scale, was sensitive to differences between the treatment conditions (Holland et al, 1998).
Such a finding is consistent with the conclusion that a composite score of pain intensity and pain interference is somewhat distinct from measures of pain intensity alone and that the SF-36 Bodily Pain scale may be less sensitive than measures of pain intensity to treatment effects (see below for a more detailed review of studies that compare different cancer pain measures on sensitivity to treatment effects). Another study found that the SF-36 Bodily Pain scale was sensitive to changes in pain from before to after surgery for oral or oropharyngeal cancer (Rogers, Lowe, et al, 1998). A third study showed the SF-36 to be sensitive to change in pain after mesorectal excision in 70 persons with rectal cancer (Camilleri-Brennan and Steele, 2001). These latter studies indicate that the SF-36 Bodily Pain Scale is able to detect changes in pain related to treatments on some occasions.

Validity and Reliability of a Composite Measure of Pain Experience and Pain Cognitions

Arathuzik (1994) describes the development of the Pain Inventory, a 38-item scale that includes items assessing a variety of cancer pain dimensions such as the temporal aspects of pain, pain quality, pain intensity, pain distress, pain interference, and pain-related cognitions. A primary strength of this measure is that its development included procedures for examining and then increasing the content validity of the items through an initial review of potential items by 40 nurses for their relevance to pain assessment. As a result of the review and based on feedback from the nurse experts, 13 items were subsequently either clarified or added to the measure. The Pain Inventory was shown to have good split-half reliability (r = .85) and internal consistency (Cronbach’s alpha = .84) in 2 samples of women with breast cancer (Arathuzik, 1994). Also, the total Pain Inventory score was significantly and positively associated with pain intensity (r = .37) and pain distress (r = .61), and negatively associated with use of pain coping strategies (r = −.22). Unfortunately, however, combining so many pain dimensions (eg, pain quality, pain intensity, distress, pain interference, cognitions) into a single scale score obscures the meaning of the total score. Also, the measure has yet to be used in any other study. However, the domains of pain identified by the author do support the importance of considering a variety of pain dimensions when developing assessment protocols and provide some guidance concerning those cancer pain dimensions deemed most important by the experts surveyed in this study.

Results Concerning Proxy Measures of Cancer Pain

A line of research on cancer pain assessment concerns the extent to which persons other than the patient, such as family members, caregivers, or clinicians, are accurate in their estimates of patient pain. Research on the validity of proxy measures is important to consider in this review because such measures provide a potential source of pain assessment if patients are unable to provide pain ratings. Geddes et al (1990) asked 53 patients with lung cancer and their nurses to rate patient pain (among other symptoms) by using a 4-point VRS/NRS in the context of a randomized controlled trial. They found strong agreement between the 2 sources of data (kappa = .76, percent complete agreement = 71%). Hollen et al (1993) reported that the correlation coefficient between a patient-rated VAS-I and a clinician-rated VAS-I in 52 patients with lung cancer was “greater than .75.” Strömgren and colleagues also reported a high concordance rate between patient self-report of pain by using the QLQ-C30 and physician notes (Strömgren, Groenvold, Pedersen, et al, 2001) and between patient report and nurse records on the presence of pain in the medical record of patients with various cancer diagnoses (Strömgren, Groenvold, Sorensen, et al, 2001; see also Velikova et al, 2001).

On the other hand, Grossman et al (1991) found much weaker associations between patient and clinician (nurse, house officer, or medical oncology fellow) ratings of pain by using a VAS-I (rs = .35 to .46). Sneeuw, Aaronson, Osoba, et al (1997) also found only weak to moderate associations between patient-rated composite scores of pain intensity and pain interference (with the QLQ-C30 Pain Scale; r = .23) and headache (with a 4-point VRS of headache; r = .57) and scores from a “significant other” in a sample of 103 patients with brain cancer. In a similar study with 307 patients with various cancer diagnoses, Sneeuw et al (1998) found a fairly strong (r = .63) association between patient and significant other composite scores from ratings of pain intensity and interference (with the QLQ-C30 Pain Scale), with significant others tending to overestimate patient pain on average. In a third study, these investigators found similar associations for proxy measures of pain intensity, both between patients and a caregiver (often the spouse; r = .53 and .71 at 2 assessment points) and between patients and physicians (r = .64 and .72 at 2 assessment points; Sneeuw, Aaronson, Sprangers, et al, 1997). Physicians tended to rate patient pain as significantly lower and caregivers tended to rate patient pain as significantly higher than patients did. These findings were mostly replicated in a fourth study, with the associations between patient and significant other, patient and physician, and patient and nurse ratings all between .50 and .66 (Sneeuw et al, 1999).
In this latter study, significant others also tended to overestimate patient pain and physicians to underestimate patient pain, although nurse ratings were not significantly different from patient ratings. Cliff and MacDonagh (2000) found that significant others overestimated patient pain intensity in a sample of 164 men with prostate cancer. Deschler et al (1999) also found that caregivers tended to overestimate patient pain by using the SF-36 Bodily Pain scale, although spouses provided more accurate estimates than nonspouse caregivers. Wilson et al (2000) also found that partners provided more accurate estimates of patient pain than physicians did in a sample of men with prostate cancer (by using the QLQ-C30), but the associations between patient and proxy measures were not that large (partner, r = .47; physician, r = .17). Moreover, these associations were weak for both partners (r = .16) and physicians (r = .29) in a separate sample of women with breast cancer (Wilson et al, 2000). In both samples, physicians overestimated pain, on average, whereas partners only overestimated pain in the sample of patients with breast cancer. Nekolaichuk, Bruera, et al (1999) found that physicians underestimated patient pain (with a VAS-I), as did nurses, although the difference between nurse and patient ratings was not statistically significant. The associations between patient and nurse, and patient and physician ratings were not very strong (range, .40 to .62). Similarly, Larue et al (1995) found that, across 20 different treatment centers, physicians tended to underestimate patient pain (with 0-10 NRSs).

In a study designed to examine the factors that contribute to patient pain scores in 32 patients with various advanced cancers, Nekolaichuk, Maguire, et al (1999) found a high level of variability between patient, nurse, and family caregiver ratings of pain by using the VAS-I. Hovi et al (1999) found no significant differences, on average, between patient and nurse VAS-I ratings of patient current and least pain, but nurses significantly underestimated patient worst and acceptable pain. On the MPQ [20], patients used more words to describe their pain, and on a pain drawing, patients described more pain locations, than nurses did (Hovi and Lauri, 1999). Madison and Wilkie (1995) administered the MPQ, a pain drawing, and a VAS-I to 18 patients with lung cancer and a family member or friend. Although the low number of subjects in this study makes definitive conclusions difficult, they found only a weak association between the 2 sources of data in number of pain sites (rho = .26), and that patients reported more sites. The VAS-Is showed only a moderate association between the 2 sources (rho = .50 and .48 at 2 assessment points), and the specific MPQ subscales showed only weak to moderate associations (rhos = .10 to .42). Miaskowski et al (1997) found that only 29% of family members provided ratings of patient pain (on a VAS-I) that were congruent (within ±10 mm on a 100-mm VAS) with the patient ratings. In a preliminary study examining the effects of patient coaching on cancer pain assessment, Wilkie et al (1995) found that patient coaching improved the congruence between patient and nurse ratings and descriptions of patient pain. However, even with coaching, they found a high level of discordance between patient and nurse VAS ratings.
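The agreement indices cited in this section (percent complete agreement, kappa, and congruence within ±10 mm on a 100-mm VAS) can be sketched as follows. This is an illustrative sketch only: it assumes unweighted Cohen's kappa, which the studies reviewed may or may not have used, and the function names and example ratings are hypothetical.

```python
from collections import Counter


def percent_agreement(patient, proxy):
    """Proportion of pairs in which patient and proxy chose the same category."""
    return sum(p == q for p, q in zip(patient, proxy)) / len(patient)


def cohen_kappa(patient, proxy):
    """Unweighted Cohen's kappa: agreement corrected for chance agreement.

    Undefined (division by zero) if chance agreement equals 1.
    """
    n = len(patient)
    observed = percent_agreement(patient, proxy)
    patient_counts, proxy_counts = Counter(patient), Counter(proxy)
    categories = set(patient) | set(proxy)
    expected = sum(patient_counts[c] * proxy_counts[c] for c in categories) / n**2
    return (observed - expected) / (1 - expected)


def congruent_fraction(patient_mm, proxy_mm, tolerance_mm=10):
    """Share of proxy VAS ratings within +/- tolerance_mm of the patient rating."""
    pairs = list(zip(patient_mm, proxy_mm))
    return sum(abs(p - q) <= tolerance_mm for p, q in pairs) / len(pairs)


# Hypothetical 4-point ratings (0 = none ... 3 = severe) from 8 patient-nurse pairs.
patient_vrs = [0, 1, 2, 3, 0, 1, 2, 3]
nurse_vrs = [0, 1, 2, 3, 0, 1, 2, 2]
print(percent_agreement(patient_vrs, nurse_vrs), round(cohen_kappa(patient_vrs, nurse_vrs), 2))

# Hypothetical 100-mm VAS ratings from 4 patient-family member pairs.
print(congruent_fraction([50, 30, 80, 10], [55, 45, 70, 40]))
```

A correlation coefficient can be high even when proxies systematically over- or underestimate pain, so agreement indices such as these and mean differences answer a different question than r does.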

Results of Cancer Pain Measure Comparison Studies

Although information concerning the psychometric properties of cancer pain measures can be obtained when only a single measure is used in a study, more useful information can be obtained from studies that use more than 1 measure. By administering 2 or more measures of pain in the same study and with the same population, it is possible to directly compare their psychometric properties without having to be as concerned about possible confounds due to differences between studies (eg, differences in samples and procedures) that can impact the estimates of a measure’s validity and reliability. Unfortunately, only a few studies provide such comparison data. The validity criterion most often compared in such studies is that of sensitivity to change with time or with treatment. In these studies, sensitivity is usually operationalized as a statistic that reflects the effect size for detecting a change in pain (eg, pretreatment to posttreatment) or a difference between treatment and control conditions. Relevant statistics include the t statistic, the F statistic, the P value associated with these statistics, or some measure of change divided by a measure of variance (eg, lambda). Larger t and F statistics and smaller P or lambda values indicate greater sensitivity.

Wallenstein (1991) performed a reanalysis of 11 randomized controlled trials (RCTs) of analgesics for cancer (2 RCTs) and postoperative (9 RCTs) pain that had samples ranging from 40 to 339 (total number of subjects across all trials, 1863; total number with cancer, 347). In both samples of patients with cancer, VAS measures of pain relief were more sensitive to treatment effects than VRS measures of pain relief were. Stockler et al (1998) compared the relative sensitivity of a VAS-I, a 6-point VRS-I, and the QLQ-C30 Pain Scale for differentiating the effects of palliative chemotherapy with intravenous mitoxantrone plus oral prednisone versus oral prednisone alone in 143 men with prostate cancer. They found that all 3 measures were able to detect the beneficial effects of palliative chemotherapy, although the VAS-I was more sensitive than the other 2 measures, and the QLQ-C30 was more sensitive than the VRS-I. Gaston-Johansson et al (1992) administered a 15-item measure of sensory pain (consisting of words that describe a variety of pain sensations such as sharp, dull, and sore), a mechanical VAS of pain intensity, and an 11-item measure of pain affect (consisting of 11 descriptors such as annoying, nagging, and miserable) to 17 patients with Hodgkin’s lymphoma. These scales were administered before and then on 3 occasions after autologous bone marrow transplantation. They found the VAS-I to be slightly more sensitive to changes in pain than either the measure of sensory or affective pain, although the low N of this study limits the confidence that can be placed in the findings.
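As a rough illustration of how sensitivity to change is operationalized, the sketch below computes a paired t statistic and a change score divided by its standard deviation for pretreatment and posttreatment ratings. The second index is commonly called the standardized response mean; that label, the function names, and the example data are assumptions introduced here and do not come from the studies reviewed.

```python
from math import sqrt
from statistics import mean, stdev


def change_scores(pre, post):
    """Pretreatment minus posttreatment rating, so positive values mean less pain."""
    return [a - b for a, b in zip(pre, post)]


def paired_t(pre, post):
    """Paired t statistic for the mean pre-to-post change."""
    diffs = change_scores(pre, post)
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def standardized_response_mean(pre, post):
    """Mean change divided by the standard deviation of change."""
    diffs = change_scores(pre, post)
    return mean(diffs) / stdev(diffs)


# Hypothetical 0-10 ratings from the same 4 patients before and after treatment.
pre_ratings = [6, 7, 5, 8]
post_ratings = [4, 4, 4, 5]
print(round(paired_t(pre_ratings, post_ratings), 2))
print(round(standardized_response_mean(pre_ratings, post_ratings), 2))
```

When 2 measures are administered to the same patients at the same time points, the measure with the larger statistic is the more sensitive of the two in that sample. Statistics computed on different samples are not directly comparable in this way.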
Holland et al (1998) administered a VAS of pain intensity, a VAS of pain relief, an 8-point VRS of pain intensity, and the SF-36 Pain Scale to 38 patients with a variety of cancer diagnoses before and after 6 weeks of either fluoxetine or desipramine to treat depression. Only the VAS-I showed significant effects of fluoxetine on pain. Moore et al (1994) administered a 6-point VRS of pain intensity, a VAS-I, and the QLQ-C30 Pain Scale (which is a composite score of pain intensity and pain interference) to 27 men with prostate cancer before and after receiving mitoxantrone with prednisone. All 3 pain measures were sensitive to the expected decreases in pain over time, although it appears that the VRS-I may have been slightly more sensitive than either the VAS-I or the QLQ-C30 composite measure of pain intensity and pain interference. In a pretest-posttest study of the effects of prednisone on pain associated with prostate cancer, both the MPQ-PRI total score (P = .009) and a 6-point VRS-I (P = .011) were able to detect treatment effects, whereas a VAS-I was not (P = .12). Kucuk et al (2001) found very similar effect sizes for 3 measures (a 5-point VRS-I, a VAS-I, and the Short-Form MPQ Total score) in an analgesic trial. Littman et al (1985) performed a reanalysis of 23 RCTs of analgesics for postoperative, cancer, acute trauma, or renal or ureteral colic pain (total number of subjects, 1330; total number with cancer not specified). They found that 3 scales (VRS-I, VAS-I, and a VRS of pain relief) were similarly sensitive, although the relief ratings tended to show slightly greater sensitivity than VAS-I difference scores did, and VAS-I difference scores showed slightly greater sensitivity than VRS-I difference scores did. Consistent with the finding that relief scales may be more sensitive to change than intensity difference scores, Ohnhaus and Adler (1975) used a 5-point VRS of pain relief and a VAS of pain intensity in a double-blind crossover study comparing 2 analgesics to placebo. Although neither of the measures detected significant treatment effects, they found that the relief measure was more sensitive than the VAS was. Similarly, Portenoy, Payne, Coluzzi, et al (1999) found that a 4-point VRS of pain relief was more sensitive than a 0-10 NRS of pain intensity for detecting the effects of oral transmucosal fentanyl citrate in a sample of 65 patients with various cancer diagnoses. On the other hand, Du Pen et al (1999) found that a 0-10 NRS of usual pain intensity (1 of the BPI pain intensity items) was more sensitive than a 0% to 100% rating of pain relief (or a 0-10 NRS of worst pain, number of pain sites, or quality of pain) for detecting the effects of a cancer pain treatment algorithm. Stambaugh and Sarajian (1981) used a 5-point VRS of pain relief, a VAS-I, and a 5-point VRS-I to determine the effects of 2 analgesics compared with placebo in a double-blind crossover study. In this study, all 3 measures were essentially equivalent in their ability to detect treatment effects.

A second criterion on which various pain measures have been compared is their failure rates and patient preferences for the different measures. Littman et al (1985), who performed a reanalysis of 23 RCTs, also reported on the frequency of missing data in these clinical trials.
Of the 167 subjects in these studies who were missing data, 93 (56%) were missing data on all scales (VAS-I, VRS-I, VRS-Relief). However, most of the rest (63, 44%) were missing data only for the VAS-I. Kremer et al (1981) examined the preference and failure rates of a VAS-I, a 0-100 NRS-I, and a 5-point VRS-I among 50 patients seen at a pain clinic, 32 of whom had cancer. They also found that the VAS had the highest failure rate (11%) and that the failure rates for the 0-100 NRS (2%) and VRS (0%) were very low. The mean age of the persons unable to complete the VAS (73.3 years) was significantly higher than that of those who were able to complete this measure (54.4 years). In this study, the VRS was the scale most preferred (by 53% of the patients with cancer), followed by the 0-100 NRS (25% of those with cancer), and the VAS was least preferred by the patients with cancer (16%). Mostly replicating the findings of Kremer et al, Paice and Cohen (1997) compared the preference and failure rates of a VAS-I, 0-10 NRS-I, and 5-point VRS-I in 50 patients with various cancer diagnoses. Although 10 (20%) of their subjects were unable to complete the VAS, all were able to complete the VRS and NRS. Moreover, mean opioid intake was significantly higher for subjects unable to complete the VAS than for those who were able to complete this measure. They found that half (50%) of the patients preferred the 0-10 NRS, but that many (28%) also preferred the VRS over the other scales. Only 6 (12%) of the subjects preferred the VAS over the other scales. Shannon et al (1995) administered the MPQ, 3 VAS scales (of pain intensity, pain relief, and mood), a VRS of pain intensity, and a Face Scale to 63 inpatients with cancer. Again, the VAS scales evidenced the highest failure rate, with 89% able to complete the VRS, 84% the MPQ, 81% the Face Scale, and 75% the VAS scales. Soh and Hui-Gek (1992) asked 79 patients with various cancer diagnoses to complete a VAS-I and a VRS-I. Although they did not report specific failure rates, they did comment that the VAS was more difficult to explain to patients than the VRS was. Sze et al (1998) administered a VAS-I and a 0-10 NRS-I to 95 patients with various cancer diagnoses. Again, the failure rate for the VAS-I (14%) was higher than for the NRS-I (3%). On the other hand, Tannock et al (1996) found a 6-point VRS-I and a VAS-I to have similar failure rates (8% and 11%, respectively) in a sample of 136 men with prostate cancer.

Cognitive impairment may interfere with the comprehension and use of pain rating scales, although it may impact the use of some scales more than others. Radbruch et al (2000) administered the Mini-Mental State Examination (MMSE) to 108 patients with advanced cancer in a palliative care unit and also attempted to administer the BPI intensity and interference items (all 0-10 NRSs) to these patients. If the patients were unable to complete the BPI, they were asked to rate the intensity of their pain on a 4-point VRS (none, mild, moderate, severe). If they were unable to use the 4-point VRS, they were simply asked to confirm the presence or absence of pain (ie, a 2-point VRS-I) along with other symptoms. Radbruch et al found that only 75% of the patients were able to complete the 0-10 intensity items and 62% the 0-10 interference items.
Moreover, the numbers of missing responses for the BPI intensity items (r = −.64) and interference items (r = −.47) were both associated significantly with the MMSE score, indicating that a patient’s degree of cognitive impairment impacts his or her ability to respond appropriately to 0-10 NRS scales. However, many of the patients unable to complete the BPI 0-10 NRS items were able to complete a 4-point VRS of pain intensity, and all of the patients, even those who could not rate their pain by using a 4-point VRS, were able to report on the presence or absence of pain.
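The stepwise procedure described by Radbruch et al (2000) amounts to falling back from the most detailed scale to the simplest one the patient can use. A minimal sketch of that logic follows. It is an illustration under stated assumptions and not the investigators' actual protocol; the function name, argument names, and return format are all hypothetical.

```python
VRS_LABELS = ("none", "mild", "moderate", "severe")


def record_pain(nrs_0_10=None, vrs_4pt=None, pain_present=None):
    """Return (scale_used, value) for the most detailed rating obtained.

    Each argument is None when the patient could not complete that scale.
    A rating of 0 on the NRS is a valid answer and is not treated as missing.
    """
    if nrs_0_10 is not None:
        if not 0 <= nrs_0_10 <= 10:
            raise ValueError("NRS rating must lie between 0 and 10")
        return ("NRS 0-10", nrs_0_10)
    if vrs_4pt is not None:
        if vrs_4pt not in VRS_LABELS:
            raise ValueError("VRS rating must be one of " + ", ".join(VRS_LABELS))
        return ("VRS 4-point", vrs_4pt)
    if pain_present is not None:
        return ("presence/absence", bool(pain_present))
    return ("not assessable", None)


# Hypothetical patients with decreasing ability to use the scales.
print(record_pain(nrs_0_10=6))
print(record_pain(vrs_4pt="moderate"))
print(record_pain(pain_present=True))
```

Recording which scale was used alongside the value keeps ratings from different scales from being pooled inadvertently in later analyses.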

Discussion

The results of this review summarize what is now known concerning the validity and reliability of existing measures of cancer pain. The findings support the multidimensional nature of cancer pain and provide varying degrees of support for the validity and reliability of measures of pain intensity, pain interference, pain relief, temporal pain patterns, pain quality (including affective qualities of pain), composite measures of pain intensity and pain interference, and proxy measures of patient cancer pain. The findings also provide guidance for researchers and clinicians concerning which measures may have the most utility and suggest avenues of future research that will help to clarify the psychometric properties of cancer pain measures.

Measuring Cancer Pain Intensity

There are several conclusions that may be drawn from the findings of the research on the psychometric properties of pain intensity measures. First and most importantly, each of the commonly used ratings of pain intensity, including the VAS-I, the NRS-I, and the VRS-I, appears adequately valid and reliable as a measure of pain intensity among the many different samples of persons with cancer. Other pain intensity rating scales (eg, Mechanical Visual Analogue Scales, Graphic Rating Scales) are used less often, but the research that has been performed by using these measures generally supports their validity as well. Moreover, no one scale consistently shows greater sensitivity than any other in its ability to detect changes in pain. Of all the individual pain intensity ratings examined, only 1, a Finger Dynamometer, showed poor psychometric properties. Although reliability is an important issue for pain intensity measures, as it is for any measure, reliability can be difficult to determine for single-item measures of pain. Internal consistency, 1 of the most common measures of reliability, cannot be computed from single-item rating scales. Also, test-retest stability coefficients for measures of pain may not always reflect reliability, because pain can, and often does, change from one moment to the next. Such changes in pain can reduce the test-retest reliability coefficient even for pain measures that are highly reliable. For these reasons, a pain intensity measure’s validity coefficients (eg, associations with other measures of pain intensity and with important criterion measures) are more important criteria than a measure’s reliability coefficient(s). As indicated above, the findings from the studies reviewed support the criterion validity of all commonly used ratings of pain intensity.
There do appear to be consistent and important differences between VRSs, NRSs, and VASs in terms of their failure rates and in patient preference, however. VASs usually show higher failure rates than NRSs and VRSs, and NRSs tend (when differences are found) to show slightly greater failure rates than VRSs. Similarly, VRSs and NRSs tend to be preferred over VASs by patients. Higher failure rates with VASs have been shown to be associated with older age and greater amount of opioid intake, and mental impairment has been shown to be associated with inability to complete 0-10 NRS ratings of pain intensity and pain interference. Many patients unable to complete 0-10 NRSs appear to be able to complete 4-point VRSs, however. As a group, these findings suggest that VAS ratings may not be the best choice for assessing cancer pain intensity, especially among patients who are elderly or who may be using opioid medications. NRSs, on the other hand, appear to be well tolerated by most patients and appear to be at least as sensitive and valid as the more traditional VAS rating scales. 0-10 NRSs also have the advantage that data exist to help clarify the meaning of specific ratings and NRS change scores [7,8,24], information that is directly tied to treatment decisions according to current treatment guidelines [13,33,34]. However, if the population is expected to include patients with significant cognitive impairment, a simpler 4-point VRS (eg, no, mild, moderate, or severe pain) may be the best choice.

In addition to the more frequently used single ratings of pain intensity, pain intensity may also be assessed by using multiple-item scales. The multiple-item intensity measure used most commonly among patients with cancer is the Pain Intensity Scale of the BPI, which averages ratings of current pain and of least, worst, and average pain over a specific time period (eg, the past 24 hours) into a single summary score [3]. This composite measure has shown excellent psychometric properties, including high internal consistencies and criterion-related validity. A similar composite score of average, worst, and least pain ratings has demonstrated similar psychometric strengths (Clearly et al, 1995). Although the sensitivity of multiple-item scales of pain intensity to the effects of cancer pain treatments has not yet been determined, there are several reasons to expect that research will prove these measures to be sensitive when tested. First, composite measures of pain intensity, like the BPI, are made up of individual ratings that themselves have shown sensitivity to changes in cancer pain. For example, one of the BPI intensity items (usual pain) was shown to be sensitive to the effects of a pain treatment algorithm (versus standard care) in a sample of patients with various cancer diagnoses, and another BPI intensity item (worst pain) was sensitive to changes in cancer pain over time [6] (see also [16]).
In addition, research using samples of patients with pain problems other than cancer has shown the BPI Intensity scale to be sensitive to the effects of pain treatment [25,27]. What is less certain is whether a composite pain intensity score shows improved psychometric properties when compared with individual pain intensity ratings. Other composite measures of pain intensity either combine pain from multiple pain sites into a single score (eg, de Haes et al, 1990) or combine pain ratings from sites associated with specific cancer diagnoses (eg, a head and neck cancer pain scale that assesses jaw pain, mouth pain, mouth soreness, and throat pain; Bjordal et al, 1994). The findings concerning these other composite pain intensity measures indicate that a measure made up of ratings from divergent pain sites may have limited reliability, because having pain at one site (low back) may or may not be associated with having pain at a different site (headaches) (de Haes et al, 1990). Combining the intensity of pain at these different sites into a single score may therefore lose important information. On the other hand, the findings concerning cancer diagnosis-specific measures provide some support for the reliability of assessing pain intensity in and around sites associated with a specific cancer diagnosis, in that these composite scales are internally consistent and demonstrate criterion-related validity.
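The BPI-style composite described above, an average of worst, least, average, and current pain ratings, can be sketched in a few lines. The function name is hypothetical, and the handling of missing items (returning no score if any of the 4 ratings is absent) is an assumption made for illustration, not a scoring rule taken from the BPI documentation.

```python
from statistics import mean


def intensity_composite(worst, least, average, current):
    """Average four 0-10 pain intensity ratings into one summary score.

    Returns None if any rating is missing (an assumed policy, see lead-in).
    """
    ratings = (worst, least, average, current)
    if any(r is None for r in ratings):
        return None
    if any(not 0 <= r <= 10 for r in ratings):
        raise ValueError("each rating must lie between 0 and 10")
    return mean(ratings)


# Hypothetical ratings for the past 24 hours.
print(intensity_composite(worst=7, least=2, average=4, current=4))
```

Because the composite pools 4 ratings, a given score can arise from quite different patterns (eg, steady moderate pain versus mostly mild pain with severe peaks), which is one reason the individual items are often reported alongside it.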

In terms of recommendations for future research, there does not appear to be a strong need for future studies to determine the psychometric properties of single-item ratings of pain intensity. The many findings that are available provide a fairly clear picture concerning their validity and reliability. However, there is a need to determine whether anything is gained by using multiple-item measures of pain intensity (eg, the BPI Pain Intensity scale or the Head and Neck Cancer Pain Questionnaire Pain Intensity scale) instead of single-item ratings. Theoretically, multiple-item scales should be more reliable and therefore potentially more sensitive for detecting changes in pain associated with time or with treatment [22]. Future studies may test this hypothesis by comparing multiple-item measures with single pain intensity ratings in the context of longitudinal studies and clinical trials.

Measuring Cancer Pain Interference

Pain interference refers to the extent to which pain interferes with important aspects of daily life, such as sleep, mood, and mobility, among many others. The most common measure of pain interference in cancer pain research is the Interference scale of the BPI [3]. This 7-item scale has shown excellent reliability (internal consistency) across a large number of different samples of patients with cancer pain. At least 2 other multiple-item scales of pain interference have been described in the cancer pain literature (Ripamonti et al, 2000; Maunsell et al, 2000), and these have also been shown to have adequate to excellent internal consistency. A number of single-item measures of pain interference have also been used in cancer pain research, and the findings indicate that such measures have good criterion-related validity through their significant associations with measures of quality of life. Moreover, the research indicates that ratings of pain intensity and pain interference can show different patterns of associations with other criterion measures. These findings provide further support for the conceptual and statistical distinction between measures of pain intensity and pain interference.

There are a number of important unanswered questions concerning the validity of pain interference scales that should be addressed in future research. Although some interference scales have shown sensitivity to the effects of pain treatment in non-cancer pain patient groups (eg, the BPI [25,27]), no research has yet determined how well these measures detect changes in cancer pain over time and with effective cancer pain treatment. There is also a paucity of research to determine whether the available pain interference measures differ in terms of their sensitivity to treatment effects. In addition, very few studies have examined the correlates of pain interference in persons with cancer.
To what extent do measures of pain interference predict psychologic and physical functioning? What is the relative sensitivity of pain interference measures to pain treatments compared with measures of pain intensity or measures of global functioning? Research that addresses these questions will help to clarify the meaning and importance of pain interference measures among patients with cancer.

Measuring Cancer Pain Relief

On the surface, many clinicians or researchers might assume that a rating of pain relief after a treatment represents, or should represent, the same thing as a pretreatment to post-treatment decrease in pain intensity. If this were true, then asking patients to rate pain relief after a treatment for cancer pain could be seen as an alternative to assessing change in pain intensity from pretreatment to post-treatment. However, even though pain relief ratings are sensitive to the effects of treatment, they are not always strongly associated with pretreatment to post-treatment changes in pain intensity ratings. Moreover, some patients rate themselves as having experienced relief even when post-treatment pain returns to pretreatment levels. These findings suggest that pain relief ratings should not be interpreted to represent the same thing as pretreatment to post-treatment changes in pain [15]. The data from the studies reviewed do suggest the possibility, however, that at least in some populations of patients with cancer, ratings of pain relief may be more sensitive to the effects of pain treatment than pain intensity change scores are. Thus, in situations in which the ability to detect a treatment effect is of utmost importance, for example in clinical trials in which there may be limited power because of a limited number of subjects, researchers would be wise to consider including pain relief ratings as one of the outcome measures in the study. By doing so, they may assess aspects of pain treatment outcome not measured by simple pain intensity change scores. Regarding the selection of pain relief rating scales, the data do not clearly support the use of any one type of relief rating (eg, VRS, NRS, or VAS) over any other.
However, practical considerations might suggest that a VRS of pain relief (eg, no relief, slight relief, moderate relief, lots of relief, complete relief) may help limit the chances that patients will confuse the relief rating with pain intensity ratings, because NRS and VAS pain intensity measures can look very similar to NRS and VAS measures of pain relief.

Measuring Cancer Pain Site

Very little research has been performed to evaluate and validate measures of pain site in patients with cancer. Of the 2 methods most commonly used to assess pain site in pain research, pain drawings and site checklists, only pain drawings have been studied in patients with cancer. The findings from this research indicate some, but limited (because of the small number of studies), criterion validity for cancer pain drawings, as evidenced by their correspondence with anatomic locations associated with specific cancer diagnoses and through their association with pain intensity. Whether patient-rated pain site measures are useful in patients with cancer remains to be seen. However, research among patients with chronic pain not associated with cancer supports pain site as a distinct pain dimension that may play an important role in patient functioning over and above the effects of pain intensity alone. For example, the number of pain sites has been shown to have moderate associations with disability, pain intensity/interference composite scores, and return to work in persons with chronic pain [23,26,29]. In another study, pain in the low back area (or low back plus head and neck pain) was associated more strongly with disability than the same intensity of pain in other sites [28]. Given the potential importance of pain site (and number of pain sites) to patient functioning, there is a need for further research to examine the validity and usefulness of pain site measures as measures of treatment outcome, as descriptions of pain distribution, or as predictors of other important outcomes or variables in persons with cancer pain.

Measuring the Temporal Aspects of Cancer Pain Measures of the temporal aspects of pain, including its variability, frequency, and duration, have not received adequate attention in cancer research. The available evidence indicates that measures of pain frequency have shown criterion-related validity through their association with pain intensity and interference composite scores, type of treatment received, and pain affect (the level of “upsetness” caused by pain). In at least one of the studies reviewed, pain frequency was associated with the type of treatment received, whereas the pain intensity rating used in the study was not, suggesting that pain frequency and pain intensity can be considered distinct dimensions of cancer pain. Presence and frequency of breakthrough pain (periods of excruciating pain in the context of ongoing background pain), another important temporal aspect of pain, were similarly shown to be associated with pain interference, as well as psychologic functioning. It is possible, even likely, that temporal aspects of pain such as the frequency and unpredictability of breakthrough pain (or even, alternately, the frequency of pain-free periods) may have an impact on patient functioning over and above any effects of global average pain intensity. It is also possible that cancer pain treatments that impact such variables may have a greater impact on patient quality of life than treatments that focus exclusively on background or baseline pain would. To test these important hypotheses, valid and reliable measures of the temporal aspects of cancer pain are needed. Unfortunately, although the studies that have been performed indicate that pain frequency and variability can be assessed, there is a paucity of research that evaluates the psychometric properties of measures of the temporal aspects of pain or that develops additional measures of this important pain dimension that can then be evaluated.


Measuring the Qualitative Aspects of Cancer Pain Pain is known to have qualities in addition to its intensity. It can be experienced as hot, cold, tingly, deep, dull, worrisome, or any one (or more) of many other qualities. Measures of the qualitative and affective components of pain may be used to more fully describe cancer pain. Such measures could also potentially contribute to improved evaluation and treatment of cancer pain. Given the likelihood that some pain treatments will be found to impact some pain qualities more than others, inclusion of pain quality measures in clinical trials will help determine the specific qualities of pain that would most benefit from each pain treatment that is evaluated.5 Moreover, to the extent that a treatment might impact a relatively small subset of pain qualities, ratings of specific cancer pain qualities may turn out to be more sensitive to the effects of some treatments than ratings of global pain intensity. If so, then systematic use of pain quality measures in clinical trials of cancer pain may help identify effective treatments that might otherwise have been determined to have little effect on pain. The MPQ is the measure most often used to assess the qualitative aspects of pain, including cancer pain. Discriminative validity of the MPQ is evidenced by the moderately strong associations between the MPQ scale scores and measures of pain intensity. These associations are strong enough to indicate that the MPQ scores assess pain but not so strong as to suggest that MPQ scores assess only pain intensity. The findings also show that the MPQ scales are associated with measures of quality of life, and in at least one study were shown to be sensitive to the effects of cancer pain treatment. 
Evidence supports the validity of the MPQ-Affective subscale, in particular, for assessing pain-related distress, given the stronger associations of this scale with measures of psychologic distress than with measures of pain intensity and the relatively high scores on the MPQ-Affective scale among persons with cancer pain compared with persons with low back pain. However, the MPQ is a relatively lengthy measure (listing 78 descriptors), and many of the descriptors may not be appropriate or needed in patients with cancer pain. Moreover, the MPQ scale scores represent composite measures of multiple pain qualities, so that the MPQ may have limited utility for identifying the effects of treatment on specific cancer pain qualities. The Short-Form MPQ has some strengths that may make it more practical than the MPQ to use in cancer pain research. First, it includes only 15 descriptors instead of 78, markedly reducing the assessment burden on subjects. In addition, it retains descriptors from 2 of the MPQ primary categories (sensory and affective), making it possible to assess these dimensions of pain quality. Finally, unlike the MPQ, which requires patients to select no more than a single word from each of 20 categories of pain, respondents to the SF-MPQ are allowed to rate the severity of each pain descriptor on a 0-3 scale. This allows for scoring and analysis of each specific quality of pain. Unfortunately, only one study has examined the psychometric properties of the SF-MPQ in patients with cancer. Although this one study found the measure to have excellent internal consistency, it also found a very strong association between the 2 SF-MPQ subscales, raising the possibility that these scales may be assessing a similar underlying construct. Much more research is needed to determine the utility of the MPQ, the SF-MPQ, or even measures of the qualitative aspects of cancer pain not yet evaluated in cancer pain populations (eg, the Neuropathic Pain Scale9) for measuring cancer pain.
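Internal consistency of a multiple-item scale is commonly summarized with Cronbach's alpha. The review does not state which coefficient the cited study used, so the following is only a minimal sketch of that common index, applied to hypothetical 0-3 descriptor ratings of the kind the SF-MPQ format would produce. The function name and the data are illustrative assumptions.

```python
from statistics import pvariance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents.

    rows: one list per respondent, each holding k item ratings.
    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(rows[0])
    columns = list(zip(*rows))  # one tuple of ratings per item
    sum_item_var = sum(pvariance(col) for col in columns)
    total_var = pvariance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum_item_var / total_var)


# Hypothetical ratings (0 = none, 3 = severe) of four descriptors
# by five respondents; not data from any study reviewed here.
ratings = [
    [0, 1, 0, 1],
    [1, 1, 2, 1],
    [2, 2, 2, 3],
    [3, 2, 3, 3],
    [1, 0, 1, 1],
]
alpha = cronbach_alpha(ratings)
print("alpha:", round(alpha, 2))

# Sanity check: items that agree perfectly yield alpha = 1.
perfect = cronbach_alpha([[0, 0], [1, 1], [2, 2]])
```

A high alpha shows only that the items covary. As the text notes for the 2 SF-MPQ subscales, it does not by itself show that the subscales measure distinct constructs.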

Composite Measures of Cancer Pain Intensity and Interference Two quality of life measures commonly used in cancer research, the EORTC QLQ-C30 and the SF-36, include scales that combine pain intensity and pain interference into composite scores. There are strengths and weaknesses to combining these distinct dimensions of pain into single composite scores. Because, typically, measures of pain intensity and pain interference show at least moderate associations with one another, a scale made up of items reflecting these dimensions would be expected to show good internal consistency. Also, combining measures of intensity and interference into a single summary score could reduce the number of variables in analyses involving pain severity as a predictor (independent) or criterion (dependent) variable, thereby potentially increasing the power of statistical analyses in a research study. Such a composite measure also provides clinicians and researchers with an overall indicant of pain and its impact that has an intuitive appeal. Finally, especially for the SF-36 Bodily Pain scale, there are published norms from numerous samples of persons with and without specific health problems to which a patient’s scale score can be compared.31 This makes the interpretation of a patient’s (or groups of patients’) score(s) much easier. The primary weakness of composite measures of pain intensity and interference concerns the loss of information about the relative levels of each pain dimension included in the composite score. Someone with high pain intensity that he or she is not allowing to have an impact on activities and someone with low pain intensity that is interfering substantially with function could obtain the same score on a measure that was an average of these 2 dimensions. 
Related to this, if a treatment is shown to decrease a composite score of pain intensity and pain interference and only the changes in the composite score are reported in a research study, it is not possible to determine whether the treatment had a greater (or primary) impact on pain interference or pain intensity. Of course, investigators could analyze the pain intensity and pain interference items that contributed to the composite score separately, but unless they do so and report the findings on the individual ratings in their published report, information concerning the impact of treatment on each component dimension will be lost. On the positive side, data support the validity and reliability of both the EORTC QLQ-C30 Pain and the SF-36 Bodily Pain composite scores. The findings indicate that these scales generally have adequate to excellent internal consistencies across samples and cancer pain populations. Also, like measures of pain intensity, these measures of pain have shown appropriate associations with a number of validity criteria, including measures of other dimensions of quality of life, functional status, disease stage, performance status, tumor stage, tumor size, survival, and treatment prognosis. The finding that the QLQ-C30 Pain scale and the SF-36 Bodily Pain scale are very strongly associated with one another (r = .83) also supports the conclusion that they measure the same underlying dimension(s) of pain. Also, both the QLQ-C30 and SF-36 Pain scales have shown sensitivity to changes in pain over time and with treatment, although one study found that the SF-36 scale was not sensitive to a treatment effect that was detected by changes in a VAS rating of intensity (Holland et al, 1998), suggesting the possibility that pain severity measures may be more useful for providing global descriptions of cancer pain than for assessing treatment outcome.
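The information-loss concern described in this section can be shown with a few lines of arithmetic. The sketch below assumes a composite that is a simple average of one 0-10 intensity rating and one 0-10 interference rating. That scoring rule and all values are hypothetical, and it is not the actual scoring algorithm of the QLQ-C30 Pain or SF-36 Bodily Pain scales.

```python
def composite(intensity, interference):
    """Hypothetical composite: unweighted mean of the two ratings."""
    return (intensity + interference) / 2


# Two hypothetical patients with opposite profiles.
patient_a = {"intensity": 8, "interference": 2}  # high pain, little impact
patient_b = {"intensity": 2, "interference": 8}  # low pain, large impact

score_a = composite(**patient_a)
score_b = composite(**patient_b)
print(score_a, score_b)  # the two profiles receive the same composite

# Two hypothetical treatments starting from the same baseline (6, 6):
baseline = composite(6, 6)
drop_from_intensity = baseline - composite(2, 6)     # acts on intensity only
drop_from_interference = baseline - composite(6, 2)  # acts on interference only
print(drop_from_intensity, drop_from_interference)
```

Because both treatments produce the same composite change, a report that gives only the composite cannot say which dimension improved. Reporting the component ratings separately, as the text recommends, avoids this.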

Proxy Measures of Cancer Pain Several conclusions can be drawn from the line of research that has compared proxy with patient measures of cancer pain. First, the strength of the associations between proxy measures and patient measures is quite variable, with moderate associations being found more often than either weak or strong associations. Second, more often than not, physicians tend to underestimate patient cancer pain and caregivers (including significant others or family members) tend to overestimate patient cancer pain, relative to what patients themselves are reporting. Nurses, perhaps because of their more frequent contact with patients during treatment, are often (but not always) more accurate than either physicians or caregivers when rating patient pain. The findings from these groups of studies suggest that proxy measures of patient cancer pain may carry some valid variance and therefore might provide some indication of patient pain when patients are unable to provide their own ratings of pain experience. However, care must be taken to not conclude that a proxy’s estimate of patient pain is necessarily accurate. In particular, health care providers need to be careful to not assume that their own judgments concerning a patient’s pain intensity will necessarily reflect the patient’s own report; in fact, there is a good chance that the health care provider may be underestimating patient pain levels. There is a need, therefore, to ensure that patients are provided every opportunity to rate their own pain to ensure that accurate ratings are available when making treatment decisions.
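The two questions raised above, how strongly proxy ratings are associated with patient ratings and whether proxies systematically under- or overestimate pain, are separate and can be computed separately. The following sketch uses invented 0-10 ratings purely for illustration. The rater labels follow the text, but the numbers are assumptions, not results from the studies reviewed.

```python
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def mean_bias(proxy, patient):
    """Mean signed difference; negative means the proxy underestimates."""
    return mean(p - s for p, s in zip(proxy, patient))


# Hypothetical 0-10 pain intensity ratings for five patients.
patient = [2, 4, 6, 8, 5]
physician = [1, 3, 4, 6, 4]   # tracks the patient ratings but runs lower
caregiver = [3, 5, 7, 9, 7]   # tracks the patient ratings but runs higher

physician_r = pearson_r(physician, patient)
physician_bias = mean_bias(physician, patient)
caregiver_bias = mean_bias(caregiver, patient)

print("physician r:", round(physician_r, 2), "bias:", physician_bias)
print("caregiver bias:", caregiver_bias)
```

In this illustration a proxy can be strongly associated with patient report and still be biased, which is why a high correlation alone does not justify substituting proxy ratings for available patient ratings.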

Issues of Content Validity As reviewed above, only 4 studies (2.4% of studies reviewed) addressed issues of content validity of cancer pain measures. Evidence concerning measure reliability and criterion validity was included much more often in these studies. This finding may not be particularly surprising, in that much of the research reviewed was not specifically performed to address the psychometric properties of the measures. More often than not, findings that spoke to the psychometric properties of the measures examined were presented in the context of a study with other primary goals, such as a clinical trial or a descriptive study. However, the paucity of discussion of and presentation of data that speak to the content validity of cancer pain measures is a concern, given the importance of content validity to measure development and evaluation.2 There is an important need for clinicians and researchers to consider content validity of cancer pain measures, in particular, because of the multidimensional nature of cancer pain. Although it is likely that pain intensity will be one of the dimensions assessed, it is also important that an assessment of pain quality as well as the impact of pain on a person’s life (pain interference) be at least considered as possible secondary measures. As is evident from this review, other pain dimensions, including pain relief, pain site, and the temporal aspects of pain, are rarely considered in cancer pain research. It is possible that these other dimensions are not considered because they are unimportant to patients or to clinicians, and so there is little need to assess these pain dimensions in most situations. However, a more likely explanation for the lack of research that uses measures of these other components of cancer pain is that they are simply overlooked. It is difficult to imagine, for example, that the presence and frequency of breakthrough pain (excruciating pain that occurs in the context of ongoing background pain) are of little importance to the patient experiencing this type of pain. In addition, the number and distribution of pain sites could impact a patient’s quality of life over and above the intensity of average pain. 
To the extent that these dimensions of pain are not assessed, their associations with important functioning variables will remain unknown, and their potential for impacting quality of life will not be determined. Ultimately, the lack of assessment of these pain dimensions may also contribute to a lack of focus on them as potential outcome variables, limiting the treatment options for patients with cancer pain in multiple sites or with specific temporal patterns.

Limitations of the Current Review This review has several limitations. First, to provide focus for the review, studies were limited to those that studied pain assessment among adults with cancer. Pediatric pain assessment requires consideration of many additional issues not addressed in this review, such as the developmental stage of the person being assessed.19 Many of the conclusions drawn from this review of assessment of pain in adults with cancer therefore do not necessarily apply to the assessment of pediatric cancer pain. The reader is referred to the small, but growing, literature on pediatric cancer pain assessment for information concerning the psychometric properties of available measures.4,5,10,11,17,32 Second, the majority of the studies reviewed in this article were not specifically designed to test the psychometric properties of cancer pain measures. The fact that they contained data concerning the reliability and validity of pain measures in persons with cancer was not necessarily evident from their titles or even abstracts. Thus, it is likely that additional articles that provide such data were missed in the current review. However, the generally consistent findings across the studies that were reviewed do provide some support for the conclusions drawn from this review. Third, there are issues relevant to the assessment of pain that were not considered in this review, such as the advantages and disadvantages of assessing pain on a single occasion versus assessing pain on multiple occasions over time, cultural issues relating to pain assessment, and the advantages and disadvantages of using pain intensity cutoffs as inclusion/exclusion criteria in pain treatment outcome studies. Research concerning these issues was not discussed because the focus of the review was on cancer pain assessment specifically and not pain assessment in general. Finally, this review focused primarily on self-report and proxy measures of cancer pain and did not include discussion of research on observational measures of pain behaviors in patients with cancer. This limitation was due primarily to the fact that most, if not all, of the research on behavioral measures of cancer pain has been performed in pediatric samples.10,11 However, it is possible that behavioral observation measures of cancer pain may prove to be useful for adults as well. Future research is needed to determine the reliability, validity, and utility of observational measures of cancer pain.

Summary and Conclusions A great deal of research has been performed that provides data concerning the psychometric properties of cancer pain measures. The findings from this research support the multidimensional nature of cancer pain. The results also support the validity of a number of measures, especially the most commonly used measures of cancer pain intensity and pain interference. Measures of other dimensions of cancer pain, such as pain site and the temporal and qualitative aspects of pain, are less often used and studied. Yet measures of these and other cancer pain dimensions may prove to be invaluable for assessing cancer pain and the efficacy of cancer pain treatment. Future research that develops, refines, and evaluates such measures will provide important information that investigators and clinicians may then use to select specific scales for their research and clinical work. By increasing knowledge about and options for cancer pain assessment, investigators will ultimately contribute to a better understanding and alleviation of cancer pain.

Acknowledgments The author wishes to express his appreciation to Elena Mihailova for her invaluable assistance in typing the study summary tables and to Lisa C. Murphy for her helpful comments on an earlier version of this manuscript.


References
1. Aaronson NK, Ahmedzai S, Bergman B, Bullinger M, Cull A, Duez NJ, Filiberti A, Flechtner H, Fleishman SB, de Haes JCJM, Kaasa S, Klee M, Osoba D, Razavi D, Rofe PB, Schraub S, Sneeuw K, Sullivan M, Takeda F for the European Organization for Research and Treatment of Cancer Study Group on Quality of Life: The European Organization for Research and Treatment of Cancer QLQ-C30: A quality-of-life instrument for use in international clinical trials in oncology. J Natl Cancer Inst 85:365-376, 1993
2. American Educational Research Association, American Psychological Association, National Council on Measurement in Education: Standards for educational and psychological testing. Washington, DC, American Educational Research Association, 1999
3. Cleeland CS, Ryan KM: Pain assessment: Global use of the Brief Pain Inventory. Ann Acad Med 23:129-138, 1994
4. Collins JJ, Byrnes ME, Dunkel IJ, Lapin J, Nadel T, Thaler HT, Polyak T, Rapkin B, Portenoy RK: The measurement of symptoms in children with cancer. J Pain Symptom Manage 19:363-377, 2000
5. Collins JJ, Devine TD, Dick GS, Johnson EA, Kilham HA, Pinkerton CR, Stevens MM, Thaler HT, Portenoy RK: The measurement of symptoms in young children with cancer: The validation of the Memorial Symptom Assessment Scale in children aged 7-12. J Pain Symptom Manage 23:10-16, 2002
6. Du Pen SL, Du Pen AR, Polissar N, Hansberry J, Kraybill BM, Stillman M, Panke J, Everly R, Syrjala K: Implementing guidelines for cancer pain management: Results of a randomized controlled clinical trial. J Clin Oncol 17:361-370, 1999
7. Farrar JT, Portenoy RK, Berlin JA, Kinman JL, Strom BL: Defining the clinically important difference in pain outcome measures. Pain 88:287-294, 2000
8. Farrar JT, Young JP, LaMoreaux L, Werth JL, Poole RM: Clinical importance of changes in chronic pain intensity measures on an 11-point numerical rating scale. Pain 94:149-158, 2001
9. Galer BS, Jensen MP: Development and preliminary validation of a pain measure specific to neuropathic pain: The Neuropathic Pain Scale. Neurology 48:332-338, 1997
10. Gauvain-Piquard A, Rodary C, Rezvani A, Lemerle J: Pain in children aged 2-6 years: A new observational rating scale elaborated in a pediatric oncology unit—Preliminary report. Pain 31:177-188, 1987
11. Gauvain-Piquard A, Rodary C, Rezvani A, Serbouti S: The development of the DEGR(R): A scale to assess pain in young children with cancer. Eur J Pain 3:165-176, 1999
12. Gracely RH: Evaluation of multi-dimensional pain scales. Pain 48:297-300, 1992
13. Jacox A, Carr DB, Payne R: Management of Cancer Pain. Clinical Practice Guideline No 9. AHCPR Publ, No 94-0592. Rockville, MD, Agency for Health Care Policy and Research, US Department of Health and Human Services, Public Health Service, 1994
14. Jamison RN, Brown GK: Validation of hourly pain intensity profiles with chronic pain patients. Pain 45:123-128, 1991
15. Jensen MP, Chen C, Brugger AM: The relative validity of three pain treatment outcome measures in post-surgical pain. Pain 99:101-109, 2002
16. Lydick E, Epstein RS, Himmelberger D, White CJ: Area under the curve: A metric for patient subjective responses in episodic diseases. Qual Life Res 4:41-45, 1995
17. Manne SL, Jacobsen PB, Redd WH: Assessment of acute pediatric pain: Do child self-report, parent ratings, and nurse ratings measure the same phenomenon? Pain 48:45-52, 1992
18. Margolis RB, Tait RC, Krause SJ: A rating system for use with patient pain drawings. Pain 24:57-65, 1986
19. McGrath PA, Gillespie J: Pain assessment in children and adolescents, in Turk DC, Melzack R (eds): Handbook of Pain Assessment (ed 2). New York, NY, Guilford Press, 2001, pp 97-118
20. Melzack R: The McGill Pain Questionnaire: Major properties and scoring methods. Pain 1:277-299, 1975
21. Melzack R: The short-form McGill Pain Questionnaire. Pain 30:191-197, 1987
22. Nunnally JC: Psychometric Theory (ed 2). New York, NY, McGraw-Hill, 1978
23. Öhlund C, Eek C, Palmblad S, Areskoug B, Nachemson A: Quantified pain drawing in subacute low back pain. Spine 21:1021-1031, 1996
24. Serlin RC, Mendoza TR, Nakamura Y, Edwards KR, Cleeland CS: When is cancer pain mild, moderate or severe? Grading pain severity by its interference with function. Pain 61:277-284, 1995
25. Shiffmann R, Kopp JB, Austin HA, Sabnis S, Moore DF, Weibel T, Balow JE, Brady RO: Enzyme replacement therapy in Fabry disease: A randomized controlled trial. JAMA 285:2743-2749, 2001
26. Tait RC, Chibnall JT, Margolis RB: Pain extent: Relations with psychological state, pain severity, pain history, and disability. Pain 41:295-301, 1990
27. Thie NMR, Prasad NG, Major PW: Evaluation of glucosamine sulfate compared to ibuprofen for the treatment of temporomandibular joint osteoarthritis: A randomized double blind controlled 3 month clinical trial. J Rheumatol 28:1347-1355, 2001
28. Toomey TC, Gover VF, Jones BN: Spatial distribution of pain: A descriptive characteristic of chronic pain. Pain 17:289-300, 1983
29. Toomey TC, Mann JD, Abashian S, Thompson-Pope S: Relationship of pain drawing scores to ratings of pain description and function. Clin J Pain 7:269-274, 1991
30. Tursky B, Jamner LD, Friedman R: The pain perception profile: A psychophysiological approach to the assessment of pain report. Behav Ther 13:376-394, 1982
31. Ware JE, Snow KK, Kosinski M, Gandek B: SF-36 Health Survey: Manual and Interpretation Guide. Lincoln, RI, QualityMetric Incorporated, 2000
32. West N, Oakes L, Hinds PS, Sanders L, Holden R, Williams S, Fairclough D, Bozeman P: Measuring pain in pediatric oncology ICU patients. J Pediatr Oncol Nurs 11:64-68, 1994
33. World Health Organization: Cancer Pain Relief. Geneva, Switzerland, World Health Organization, 1986
34. World Health Organization: Cancer Pain Relief and Palliative Care (World Health Organization Technical Report Series 804). Geneva, Switzerland, World Health Organization, 1990
