The Inflammatory Bowel Disease Questionnaire: A Review of Its National Validation Studies

ORIGINAL CONTRIBUTION

The Inflammatory Bowel Disease Questionnaire A Review of Its National Validation Studies A. G. Pallis, MD,* I. A. Mouzas, MD,* and I. G. Vlachonikolis, MA, DPhil†

Abstract: Health-related quality of life (HRQoL) is an important measure of illness perception on the part of the patient. The Inflammatory Bowel Disease Questionnaire (IBDQ) is a widely used questionnaire for HRQoL assessment in patients with inflammatory bowel diseases (IBDs). This questionnaire has been adapted and validated in several languages and cultural milieus. The aim of this study is to review the methods used by several adaptation studies for assessing the validity and reliability of the adapted IBDQ. A search was made of the Medline database for relevant articles since 1989. Standard validation criteria were used for including studies for further evaluation. The following aspects of the validation procedure were examined: translation, construct validity, reliability, sensitivity to change, and the statistical methods used. Nine validation studies of the IBDQ, in England and in non–English-speaking countries (Holland, Spain, Korea, Sweden, Greece, and China), were selected. All studies concluded that the adapted instrument was valid and reliable. Only a few modifications were proposed. Two studies recommended splitting the four dimensions of the original questionnaire into five. Assessing HRQoL in patients with IBD is an ever-increasing practice, especially in clinical trials. The IBDQ was shown to be valid and reliable in several cultural and linguistic milieus when appropriate validation procedures were applied.

Key Words: Crohn’s disease, health-related quality of life, inflammatory bowel disease, inflammatory bowel disease questionnaire, ulcerative colitis

(Inflamm Bowel Dis 2004;10:261–269)

Received for publication January 11, 2003; accepted August 20, 2003.
From the *Department of Gastroenterology, University Hospital of Heraklion, Heraklion, Crete; and †Department of Biostatistics, University of Crete, Heraklion, Greece.
Reprints: Ioannis A. Mouzas, Gastroenterology Department, University Hospital of Heraklion, P.O. Box 1352, 71110 Heraklion, Crete, Greece (e-mail: [email protected]).
Copyright © 2004 by Lippincott Williams & Wilkins
Inflamm Bowel Dis • Volume 10, Number 3, May 2004

The Inflammatory Bowel Disease Questionnaire (IBDQ) is a widely used questionnaire for health-related quality of life (HRQoL) assessment in patients with inflammatory bowel diseases [ulcerative colitis (UC) and Crohn’s disease (CD)]. It was developed by Guyatt et al1 as a physician-administered questionnaire regarding the patient’s status during the last 2 weeks before administration. It consists of 32 questions divided into four dimensions: bowel symptoms (10 items), systemic symptoms (5 items), emotional function (12 items), and social function (5 items). Every question has graded responses from 1 (worst situation) to 7 (best situation), and thus the total score ranges from 32 to 224, with higher scores representing better quality of life.1,2 The original IBDQ has been shown to be a valid and reliable assessment tool that reflects important changes in the quality of life of patients with IBD.3 A self-administered version and a short form of the questionnaire have also been validated and found to be valid and reliable.4,5

The IBDQ was originally developed in a cohort of Canadian patients.1 It is well known, however, that patients’ responses to such instruments depend on underlying cultural trends, so that translation into another language is not in itself sufficient for the instrument’s use. It is thus recommended that, after translation and back-translation, the new adaptation be reexamined for its validity, reliability, and sensitivity to change in the new language and cultural context. The aim of this study is to focus on the methods used by several adaptation studies for assessing the validity of the IBDQ.
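The scoring arithmetic described above can be checked with a short sketch. The item counts and the 1 to 7 response grading come from the text; the names `DIMENSION_ITEMS`, `score_range`, and `total_score` are hypothetical helpers introduced only for illustration and are not part of the IBDQ itself.

```python
# Item counts per IBDQ dimension, as listed in the text.
DIMENSION_ITEMS = {
    "bowel symptoms": 10,
    "systemic symptoms": 5,
    "emotional function": 12,
    "social function": 5,
}
MIN_RESPONSE, MAX_RESPONSE = 1, 7  # 1 = worst situation, 7 = best situation


def score_range(n_items):
    """Return the (lowest, highest) possible summed score for n_items questions."""
    return n_items * MIN_RESPONSE, n_items * MAX_RESPONSE


def total_score(responses):
    """Sum the graded responses of one completed questionnaire (hypothetical helper)."""
    expected = sum(DIMENSION_ITEMS.values())
    if len(responses) != expected:
        raise ValueError(f"expected {expected} responses, got {len(responses)}")
    if any(not MIN_RESPONSE <= r <= MAX_RESPONSE for r in responses):
        raise ValueError("each response must lie between 1 and 7")
    return sum(responses)


if __name__ == "__main__":
    for name, n in DIMENSION_ITEMS.items():
        print(name, score_range(n))
    print("total", score_range(sum(DIMENSION_ITEMS.values())))
```

Summing the four dimension ranges reproduces the total range of 32 to 224 quoted in the text.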

MATERIALS AND METHODS

A search of the Medline database was conducted for papers published from 1989 onward, with the keywords “IBDQ,” “quality of life,” and “validation.” The search was limited to articles written in English and to those referring to adult patients. The following two criteria were used for the selection of studies6–8: (1) Validity, which is the extent to which the questionnaire measures what it purports to measure (HRQoL). As there is no “gold standard” for HRQoL, the aspect of validity of the IBDQ that is usually assessed is construct validity. This is the ability to correlate with other generally accepted “proxy” measures (eg, other independently validated general health instruments, visual analog scales, or disease activity indexes) and to relate the scores to a hypothesis (eg, to give similar or different scores for groups with similar or different clinical status, respectively). The latter example is known as known-groups comparison or discriminant ability. (2) Reliability, which is the degree of agreement (consistency) between two administrations of the IBDQ to the same patients under similar conditions (general and clinical) on two different occasions; this is called test–retest reliability or reproducibility. Reliability also covers the ability to produce consistent results across items in the same dimension (purporting to measure the same trait); this is called internal validity or internal consistency.

The selected studies were also assessed with respect to two other important aspects of the validation procedure: (1) translation; and (2) sensitivity to change (or responsiveness). More specifically, for the adaptation of a questionnaire to a new cultural context, the first step should be translation into the native language and back-translation into the original one. The back-translation is then compared with the original instrument, and ideally no significant differences should exist.9 Sensitivity to change refers to the ability of the questionnaire to detect clinically important changes over time; it is evaluated by comparing the results at baseline with those of the follow-up visit for patients who reported a change in their condition.

Except for the translation/back-translation, all the other aspects of the validation procedure require the application of statistical methods. For example, internal validity is usually assessed by: factor or principal component analysis of the 32 IBDQ items; examination of correlation coefficients (CC) between items (item-item) and between items and dimensional scores (item-dimensional total); and summary correlation measures such as Cronbach’s alpha (mean item-item correlation weighted by the total variance). These statistical methods require homogeneous samples of adequate size and appropriate use of parametric/non-parametric or multivariate techniques. The selected studies were therefore also assessed with respect to their use of statistical methods.
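As an illustration of the summary measure mentioned above, the following is a minimal sketch of Cronbach’s alpha using the standard variance-based textbook formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of the summed score). It is not taken from any of the reviewed studies; the function name `cronbach_alpha` and the data layout (one list of patient scores per item) are assumptions made for the example.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for one dimension.

    items: list of k lists, one per questionnaire item, each holding the
    scores of the same n patients in the same order (assumed layout).
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    n = len(items[0])
    if any(len(col) != n for col in items):
        raise ValueError("every item must have one score per patient")

    # Summed (dimension) score for each patient.
    totals = [sum(col[p] for col in items) for p in range(n)]
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("summed scores have zero variance")

    item_var_sum = sum(variance(col) for col in items)
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


if __name__ == "__main__":
    # Invented scores for three items answered by four patients.
    print(cronbach_alpha([[1, 2, 3, 4], [2, 2, 4, 5], [1, 3, 3, 5]]))
```

Values above 0.70 are the threshold that the reviewed studies treat as indicating a good degree of internal consistency.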

RESULTS

According to the two criteria of validity and reliability, nine adaptation studies were selected (Table 1). They included the adaptation of the IBDQ in England10,11 and in non–English-speaking countries: Holland,12,13 Korea,14 Spain,15 Sweden,16 Greece,17 and China.18 The first Dutch study12 also involved some modifications (the main one being a change in the grading of responses from seven-point to five-point scales). The version of the IBDQ validated in Spanish15 was the 36-item version19 with five dimensions. A more recent British adaptation, the UK IBDQ, was also included.11

Patients

The original McMaster IBDQ1 was developed in a cohort of IBD patients (UC and CD) and assessed with respect to reliability and sensitivity to change. A further validation (including an assessment of validity) was carried out later by the same group using a larger sample of patients with CD.20 Two of the adaptation studies used only patients with UC,10,16 whereas the remaining seven used patients with UC and CD. The sample size of each study and the method of administration are presented in Table 1.

Translating Procedures

All seven studies of adaptation to non-English languages reported details of the translation.12–18 These studies used the standard translation/back-translation method, except for the Korean study,14 which did not report information about back-translation. In most of the studies, a panel of physicians translated the English IBDQ, whereas an official bilingual translator did the back-translation. Only a few minor modifications were reported as necessary in the translations; the main ones were changes to make certain items applicable to patients with ileostomy bags.12 The Anglicized version,11 by contrast, modified the wording of the questions and reduced the responses of all items from seven to four.

TABLE 1. Validation Studies of the Inflammatory Bowel Disease Questionnaires: Number of Patients Included in Each Study, Method of Administration, Country of Adaptation, and Year of Publication

Study                   No. of Patients                      Method of Administration            Language   Year of Publication
                        UC        CD
De Boer et al           128       143                        Self-administration                 Dutch      1995
Russel et al            49        71                         Interview                           Dutch      1997
Han et al               28        —                          Interview & Self-administration     English    1998
Kim et al               98        49                         Interview                           Korean     1999
Lopez-Vivancos et al    119       97                         Self-administration                 Spanish    1999
Cheung et al            Validity: 180*; Reliability: 100*    Self-administration                 English    2000
Hjortswang et al        300       —                          Self-administration                 Swedish    2001
Pallis et al            69        45                         Interview                           Greek      2001
Leong et al             76        59                         Self-administration                 Chinese    2003

UC, ulcerative colitis; CD, Crohn’s disease. *Numbers of patients with UC or CD were not given.


Construct Validity

All studies assessed the construct validity of the adapted IBDQ using similar “proxy” measures and hypotheses relating the IBDQ scores to these measures via correlation analyses or known-group comparisons.

In the first Dutch validation of the IBDQ,12 three general (non–disease-specific) questionnaires were used: a modified version of the MOS (Medical Outcome Studies) scale, the MOS-24 questionnaire,21,22 which is a general health instrument assessing several dimensions; the Center for Epidemiologic Studies Depression scale (CES-D)23; and a separate measure of social support, the MOS social support survey.24,25 For disease activity, this study used a graded response for both UC and CD (complete remission, incomplete remission, moderate relapse, severe relapse) rated by the physicians. Several other measures were also used: health care use (number of hospital admissions and gastroenterologist or general practitioner visits) and medication use (yes/no) of sulfasalazine, mesalazine, and prednisone. All four IBDQ dimensional scores were highly correlated (Pearson CC) with the “conceptually” related dimensions of the three general indices (all p < 0.001) and significantly different between the four categories of disease activity (one-way analyses of variance; all p < 0.001). Similarly, they were significantly correlated with the measures of health care use (p < 0.01), except for the correlation between bowel symptoms and hospital admissions. In contrast, there was no significant correlation with use of medication, except between bowel symptoms and use of mesalazine. For the latter two correlation studies, both Pearson and non-parametric correlation coefficients were used.

The second Dutch validation study13 used two general assessments of well-being, the SF-36 questionnaire26 and a Visual Analog Scale (VAS) scaled from 0 to 100. A separate VAS was used for the assessment of the degree of bowel function.
Disease activity was assessed by the Harvey-Bradshaw index (HBI)27 for patients with CD and the Ulcerative Colitis Index (UCI) for patients with UC.28 All correlations (Pearson CC) between the four IBDQ dimensional scores and SF-36, VAS-general, VAS-bowel function, and disease activity were significant in both groups of UC and CD patients. With respect to disease activity, the correlation coefficients with the IBDQ dimensions social function and systemic symptoms were lower in patients with UC than in patients with CD. In addition, HBI, UCI, and the VAS for bowel function were used to define known groups for comparison. Two groupings were defined, one by disease activity (HBI > 6 or UCI > 3) and one by the VAS (VAS > 55%); in both groupings, patients meeting the criterion were classed as having “considerable bowel complaints” and the rest as having “no or minor complaints.” For both groupings, all four IBDQ dimensional scores were significantly different between patients in the two categories of bowel complaints (Mann-Whitney U test).

Similarly, the SF-36 and a disease activity index, the Colitis Activity Index (CAI),29 were used for the assessment of construct validity in the English adaptation of the IBDQ.10 All correlation coefficients (Pearson CC) between the four IBDQ dimensions and the conceptually related dimensions of SF-36 were significant. Similarly, the correlation coefficients with CAI were significant, and stronger with the bowel and systemic dimensions than with the emotional and social ones. It should be noted that, because the sample size of this study was small (n = 28), the fact that the correlations reached significance indicates strong underlying associations.

In the Korean validation study,14 only disease activity was used as a “proxy” measure for the assessment of construct validity of the adapted IBDQ. For this, the St. Mark’s index30 was used for patients with UC and the Crohn’s Disease Activity Index (CDAI)31 and HBI27 for patients with CD. All correlation coefficients (Pearson CC) between the four IBDQ dimensions and disease activity indexes were significant in both disease groups (all p < 0.001). The correlation appeared to be stronger with the bowel and systemic dimensions than with the emotional and social ones.

In the Spanish adaptation of the IBDQ,15 two general assessments of well-being were used: the Psychologic General Well-Being Index (PGWBI)32 and the EuroQol33; the latter includes a “preference value” and a VAS. For disease activity, two indices were used: the Rachmilewitz index34 for UC and the HBI27 for CD. Total and dimensional IBDQ scores correlated highly and significantly with all three other measures (PGWBI, EuroQol preference value, and VAS) in both UC and CD (Spearman CC). With respect to PGWBI, the highest correlation was observed for the systemic dimension and the lowest for emotional function in both disease groups. Correlations between the IBDQ and disease activity were significant and similar for UC and CD. The correlation with the bowel dimensional score was the highest, whereas those with the systemic and the social dimensions were lower.
In addition, the disease activity indices were also used to define known groups for comparison: remission, mild activity, and moderate-severe activity. For both diagnoses, total and dimensional IBDQ scores were significantly different between patients in the three categories of disease activity (Mann-Whitney U test).

The second English adaptation11 also used SF-36 and empirical assessments of disease activity. Note that in this study, the assessment of internal consistency yielded five IBDQ dimensions; the bowel symptoms were split into two dimensions: I (bowel movements and use of facilities) and II (general bowel symptoms). The correlations (type unspecified, but probably Pearson CC) between the five dimensional scores and the SF-36 dimensions showed similar significant patterns, although those of the two bowel function dimensions were in general lower than those of the other three dimensions (emotional, social, and systemic). Disease activity at recruitment was used to define two groups of patients (stable/mild versus moderate/severe disease). All the IBDQ dimensional scores were significantly or markedly different between the two groups (independent groups t tests); the differences in the systemic and bowel II dimensions were marked but not statistically significant (p = 0.076 and p = 0.059, respectively). There were 62 patients who relapsed during the study; the comparison (paired t tests) between their scores at recruitment (stable/mild/moderate/severe disease) and at relapse showed significant results for all IBDQ dimensions, except for bowel II (general bowel symptoms).

In the Swedish study,16 IBDQ scores were compared with three general assessments: SF-36, the Rating Form of IBD Patient Concerns (RFIPC),35 and the PGWBI.32 Disease activity was based on the physician’s assessment of patients being in remission or relapse and the degree of inflammatory changes at sigmoidoscopy.36,37 Additionally, patients provided information about the frequency of symptoms during the last 7 days: loose stools/24 hours, visible blood in stools, and abdominal pain. All correlation coefficients (Spearman CC) between the four IBDQ dimensions and the conceptually related dimensions of SF-36 were moderate to high. Similarly, there was adequate correlation between all IBDQ dimensional scores and RFIPC, ranging from −0.43 for social function to −0.65 for emotional function. PGWBI correlated highly with the systemic and emotional dimensions of the IBDQ (r = 0.76 and 0.81, respectively). All correlation values between IBDQ dimensional scores and disease activity measures were high and, as expected, highest with the IBDQ bowel symptoms dimension. Similarly, patients in relapse scored significantly lower than patients in remission on the IBDQ sum score as well as on the four dimensional scores (all p < 0.0001, Mann-Whitney U test).

In the Greek adaptation of the IBDQ,17 two general health assessments were used: SF-36 and a VAS for general well-being graded from 1 (worst) to 7 (best). For disease activity, the HBI27 and CAI29 were used for patients with CD and UC, respectively.
Correlation coefficients (Spearman CC) between all four IBDQ dimensions and the SF-36 total and dimensional scores were positive and significant (all p < 0.001), and higher in patients with CD than in those with UC. Similarly, correlation coefficients between all four IBDQ dimensions and the VAS of general well-being were positive and highly significant; they varied between 0.82 and 0.88 in patients with CD and between 0.62 and 0.85 in patients with UC (all p < 0.001). Regarding the disease activity indices, correlation coefficients with the four IBDQ dimensions were negative and highly significant, ranging from −0.67 to −0.71 in patients with CD and from −0.61 to −0.64 in patients with UC. In addition, HBI, CAI, and the VAS were also used to define known groups for comparison; two groupings were defined: (1) HBI ≥ 6 or CAI ≥ 10 indicated “severe symptoms,” otherwise “no or minor symptoms,” and (2) VAS ≤ 3 indicated “considerably unwell,” otherwise “well.” For both groupings, all four IBDQ dimensional scores were significantly different between patients in the two categories of disease activity or VAS (all p < 0.001, Mann-Whitney U test). Furthermore, a multivariate comparison using discriminant analysis38 showed that emotional and social function had no significant discriminatory power (for both groupings) once the other two dimensions, systemic and bowel symptoms, were taken into account. This result was consistent in both diagnostic groups.

Finally, Leong et al18 correlated the Chinese IBDQ with disease activity indices (CAI29 for UC patients and CDAI31 for CD patients), the previously validated Chinese SF-36, and a VAS. The Chinese version of the IBDQ correlated significantly with the SF-36 score (all four IBDQ dimensions, p < 0.001) for both disease groups. The correlation (Spearman CC) was strongest for the systemic dimension, followed by the emotional dimension, the bowel function dimension, and the social dimension. For all Chinese IBDQ dimensions, the correlation with SF-36 was stronger for CD than for UC. Correlation was also positive and significant with CDAI and CAI, ranging from 0.623 to 0.719 for CD patients and from 0.440 to 0.675 for UC patients, respectively (for all four dimensions, p < 0.001). Again, the correlations were stronger for CD than for UC. The Chinese IBDQ dimensional scores correlated well with the VAS for CD (all four dimensions, p < 0.001), but less well for UC. For UC, the bowel function and social dimensions did not produce statistically significant correlations (p > 0.05).18
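Several of the studies above report Spearman correlation coefficients between IBDQ scores and a proxy measure. A minimal sketch of that calculation follows: Spearman’s coefficient is the Pearson correlation of the rank-transformed values, with tied values given their average rank. The function names and the sample numbers are invented for illustration and do not come from any of the reviewed studies.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 upward, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    ibdq_total = [200, 180, 150, 120, 90]   # invented IBDQ total scores
    activity_index = [1, 3, 4, 8, 12]       # invented disease activity values
    print(spearman(ibdq_total, activity_index))
```

A negative coefficient, as in this invented example, is the pattern the Greek study reports between IBDQ dimensions and disease activity indices: higher activity goes with lower quality-of-life scores.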

Reliability

Only five of the nine studies assessed the two main aspects of reliability, that is, test–retest reliability and internal validity.

The first Dutch study12 assessed only the internal validity of the adapted IBDQ, using Cronbach’s alpha coefficient. All alphas (for the dimensional and total scores) were larger than 0.70, indicating a good degree of internal validity. No assessment of individual items (comparing each scale’s alpha with and without the item in question) was carried out. This study did not assess test–retest reliability.

The second Dutch study13 assessed only test–retest reliability. For this, patients were asked to complete the IBDQ twice, at an interval of 4–6 weeks. Seventy-five of the 120 participating patients (CD n = 48 and UC n = 27) remained stable during the intervening period according to a “transition question” regarding bowel complaints. The intraclass correlation coefficients (ICC) between the scores of these 75 patients on the two occasions (proportion of total variability due to variability between subjects) were high for all IBDQ dimensional scores, ranging from 0.75 to 0.93. Similarly, none of the differences between baseline and second administration were statistically significant. Both statistical assessments were carried out using analysis of variance for repeated measures.

The English adaptation of the IBDQ10 assessed both aspects of reliability, test–retest reliability and internal validity. For the test–retest reliability, patients were asked to complete the IBDQ three times over a 4-week period (a 2-week interval between administrations); the first and third were by interview and the second self-completed by post. Patients were not expected to change significantly during the study, but no evidence of this was given in the results. All ICC for the IBDQ dimensional scores, between any two occasions and across all three occasions, were high. Between the first (interview) and second (self-completed) occasions, they ranged from 0.73 to 0.93, and across all three occasions from 0.71 to 0.88; they were slightly lower between the first and third (both by interview but 4 weeks apart), ranging from 0.62 to 0.79. Similarly, the IBDQ dimensions showed high internal validity, with corrected item-total correlations mostly greater than 0.4 and Cronbach’s alpha values greater than 0.70 (range, 0.72 to 0.89). The internal validity was similar whether the IBDQ was interviewer- or self-administered.

The Spanish adaptation of the IBDQ15 reported Cronbach’s alpha (internal validity) only for the total IBDQ score: 0.96 for both UC and CD. For the assessment of test–retest reliability, patients were asked to complete the IBDQ twice (the intervening interval was not reported). One hundred thirty patients (CD n = 58 and UC n = 72) remained stable. There were no significant differences between the test and retest scores, and the ICC for the total and dimensional IBDQ scores were high for both UC and CD. [The ICC for the total score were 0.82 (UC) and 0.96 (CD), whereas for the dimensional scores they ranged from 0.70 to 0.81 (UC) and from 0.67 to 0.85 (CD).]

The second English adaptation11 used a separate sample of patients (n = 100) for the assessment of reliability. As mentioned earlier, principal component analysis was used for the assessment of internal validity; it yielded five IBDQ dimensions, with the bowel symptoms split into two dimensions. Furthermore, none of the items in the five dimensions had a corrected item-dimensional total correlation of less than 0.4, and Cronbach’s alpha values ranged from 0.78 to 0.86.
For the test–retest reliability, patients were asked to complete the IBDQ twice, at an interval of 2 weeks. Only 61 returned both the test and retest questionnaires. Forty-five did not report changes in their bowel condition. The IBDQ and its five dimensions were highly reproducible for these stable patients: their test and retest scores were not significantly different (paired t tests) and their ICC were very high: 0.97 for the total score, and from 0.87 to 0.96 for the dimensional scores.

The Swedish study16 thoroughly assessed both aspects of reliability. For internal validity, the authors carried out a factor analysis and several “item-item” and “item-dimension total” correlation analyses (Spearman CC) to assess several aspects (some overlapping), such as convergent validity (item and own dimension agreement), discriminant validity (item and other dimensions agreement), dimension homogeneity (correlations between items within each dimension), and interdimension correlation. Factor analysis yielded six factors; convergent validity was satisfactory, but discriminant validity was poor, interdimension correlations were high (especially between the emotional dimension and systemic and bowel symptoms), and dimension homogeneity was low for all four dimensions and especially for the total score. The test–retest reliability was assessed for 75 patients with stable remission during a 6-month period (total follow-up). Correlation coefficients (Spearman CC) between the test and retest scores were high for the total score (0.78) and for three dimensions (0.71–0.76), but low for social function (0.40). There were no significant changes in this sample in the total or any dimensional score (Wilcoxon signed rank test).

The Greek study17 assessed only test–retest reliability. For this, a subgroup of 46 patients was asked to complete the IBDQ twice, the second time at an interview 6–8 weeks after the first. Twenty-eight patients (CD n = 14 and UC n = 14) remained stable during this period according to a “transition question” regarding general well-being. The ICC were high, close to one, for all IBDQ dimensional scores. Similarly, none of the differences between the two administrations were statistically significant (paired t test and Wilcoxon signed rank test).

Similarly, the Chinese study18 assessed only test–retest reliability. Seventy-six patients with stable disease (change in CDAI < 50 or CAI < 4) completed the questionnaire a second time, a mean of 20 days after the first, and paired results were compared with the Wilcoxon signed rank test and the ICC. The Wilcoxon signed rank test showed no statistically significant difference for any of the four dimensions. Correspondingly, the ICC was also very high for both CD and UC, for all four dimensions.

Finally, the Korean study14 did not report any results of a reliability assessment.
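The intraclass correlation coefficient used for test–retest reliability is described above as the proportion of total variability due to variability between subjects. The sketch below computes one common form, the one-way random-effects ICC, from a one-way analysis of variance. The reviewed studies do not state which ICC form they used, so the choice of the one-way form, the function name `icc_oneway`, and the sample scores are all assumptions made for illustration.

```python
def icc_oneway(scores):
    """One-way random-effects ICC for single measurements.

    scores: list with one entry per patient, each entry a list of that
    patient's k repeated scores (for example, test and retest).
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two patients")
    k = len(scores[0])
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("every patient needs the same number (>= 2) of scores")

    grand_mean = sum(sum(row) for row in scores) / (n * k)
    patient_means = [sum(row) / k for row in scores]

    # Between-patient and within-patient mean squares from one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in patient_means) / (n - 1)
    ms_within = sum(
        (value - m) ** 2 for row, m in zip(scores, patient_means) for value in row
    ) / (n * (k - 1))

    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("ICC is undefined when all scores are identical")
    return (ms_between - ms_within) / denominator


if __name__ == "__main__":
    # Invented test and retest totals for three stable patients.
    print(icc_oneway([[100, 104], [150, 146], [200, 202]]))
```

When retest scores differ little from test scores relative to the spread between patients, the coefficient approaches one, which is the pattern reported for stable patients in the studies above.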

Sensitivity to Change

Only six of the nine studies assessed the sensitivity to change of the adapted IBDQ.

The second Dutch study13 reported results for 45 participating patients who had either improvement (n = 35) or deterioration (n = 10) in bowel complaints during the interval between the two administrations (4–6 weeks). All four IBDQ dimensions were significantly different between the two administrations (all p < 0.001; paired t tests) in patients with CD, but only the bowel symptoms dimension was different (p < 0.001) in patients with UC.

The Spanish study15 reported results for 33 patients who either relapsed (n = 12) or entered remission (n = 19) during the study; the median time between the two administrations was 13 weeks. All IBDQ scores, total and dimensional, were significantly different between the two administrations (all p < 0.01; Wilcoxon signed rank test) in both UC and CD patients, whether the change was to relapse or to remission. Standardized effect sizes (mean change in score between the two administrations divided by the standard deviation of the first) among patients who relapsed were higher for patients with CD than for patients with UC (for the total score, 8.04 and 1.7, respectively). In contrast, the effect sizes were similar in patients who entered remission (for the total score, −1.81 and −1.88 for patients with CD and UC, respectively).

In the Anglicized version of the IBDQ,11 15 of the 61 patients who returned both test and retest questionnaires reported a change in the general rating of their bowel condition. Significant differences were found only for the total IBDQ score (p = 0.024) and the two bowel function dimensions (p = 0.045 and p = 0.026, respectively), whereas the difference for social function was marked but not significant (p = 0.069).

Similarly, the Swedish study16 assessed the sensitivity to change for 39 patients who either relapsed (n = 15) or entered remission (n = 24) between baseline and the follow-up (at 6 months). The IBDQ total and its four dimensional scores were significantly different between the two administrations (all p < 0.01; Wilcoxon signed rank test).

In the Greek study,17 18 patients reported improvement (n = 15) or deterioration (n = 3) of general well-being between the two interviews (6–8 weeks). All four IBDQ dimensions were significantly different between the two administrations (all p < 0.001; paired t test and Wilcoxon signed rank test).

Finally, the Chinese IBDQ study18 assessed paired Chinese IBDQ scores in patients who showed a change in disease activity (change in CDAI > 50 or CAI > 4). The Wilcoxon signed rank test demonstrated a significant difference between the paired Chinese IBDQ scores in all four dimensions (bowel function dimension, p = 0.005; social dimension, p = 0.007; systemic dimension, p = 0.019; emotional dimension, p = 0.041).
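The standardized effect size reported by the Spanish study is defined above as the mean change in score between the two administrations divided by the standard deviation of the first. A minimal sketch of that definition follows. Two details are assumptions, since the review does not state them: the change is taken as second minus first, and the standard deviation is the sample (n − 1) form. The function name and the scores are invented for illustration.

```python
from statistics import mean, stdev


def standardized_effect_size(first, second):
    """Mean paired change divided by the SD of the first administration.

    first, second: scores of the same patients, in the same order, at the
    first and second administration. Change is second minus first
    (assumed sign convention); stdev is the sample standard deviation.
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need paired scores for at least two patients")
    baseline_sd = stdev(first)
    if baseline_sd == 0:
        raise ValueError("baseline scores have zero spread")
    changes = [after - before for before, after in zip(first, second)]
    return mean(changes) / baseline_sd


if __name__ == "__main__":
    baseline = [100, 120, 140]    # invented IBDQ totals at first administration
    follow_up = [110, 130, 150]   # invented totals at second administration
    print(standardized_effect_size(baseline, follow_up))
```

Because the denominator is the baseline spread rather than the spread of the changes, a homogeneous baseline sample can produce very large values even for moderate mean changes.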

DISCUSSION

In this critical review, we evaluated nine validation studies of IBDQ adaptations in seven languages. In almost all studies, the IBDQ was confirmed to be a reliable, valid, and sensitive instrument for assessing HRQoL.

From the physician’s perspective, the main therapeutic goal in inflammatory bowel disease, as in other chronic diseases, is the achievement of long remission phases. From the patient’s perspective, however, IBD may have a significant impact on everyday life for a long period of time, sometimes irrespective of disease activity. This makes the development of methods for accurate assessment of global and disease-specific quality of life imperative. The need for a valid and reliable instrument to assess HRQoL in IBD is therefore well founded, particularly when the patient’s needs are put first. Thus, the accurate assessment of HRQoL plays an important role not only in clinical trials or health research surveys, but also in the therapeutic management of the individual patient.

The first step in adapting a useful instrument such as the IBDQ to a non–English-speaking culture concerns the translation procedures. To be successful, translations from one language/culture to another must use terms referring to real issues and experiences that are familiar, if not exactly similar, in both cultures. If cultures differ greatly in their overall way of life, it will be difficult to achieve equivalence (semantic and conceptual) of linguistic statements no matter how carefully the translation is done. The back-translation is necessary for achieving this equivalence, as it allows comparison with the original instrument and indicates the changes needed until agreement is reached. However, in the existing adaptations, only a few minor modifications were reported as necessary in the translations. Details regarding the translating procedure were reported in all seven studies of adaptation to non-English languages.12–18 The standard translation/back-translation method was used in six studies.12,13,15–18 The English study (Anglicization of the IBDQ),11 however, indicated the need for modifications in the wording of the questions. Two studies11,12 reduced the responses of all items from seven to four11 and five,12 respectively. De Boer et al12 state that five-point Likert scales are preferred over seven-point scales in questionnaires because of better reliability and consistency, whereas Cheung et al11 report that the reduction was made for reasons of simplicity, without any further clarification. Additionally, Cheung et al,11 on the basis of principal component analysis, excluded two questions from the UK-IBDQ. In our opinion, any structural changes to the original IBDQ (reduction of response options, exclusion of items) make it difficult to compare results across different patient populations and weaken the instrument when used in multicenter clinical trials.

All studies evaluated the construct validity of the modified instrument, either by using so-called “proxy” measures, such as generic instruments (SF-36, MOS-24, PGWBI), visual analog scales, and disease activity indexes, or by testing specific hypotheses about the IBDQ scores.
This was carried out via correlation analyses or known-group comparisons, respectively. All studies used widely accepted and reliable “proxy” measures (Table 2), except for the Korean study,14 which used only disease activity as a proxy measure. This was a serious limitation of that study, as patients’ discomfort in IBD is caused not only by intestinal and extraintestinal inflammation and complications but also by the effects of the disease on self-image, psychological function, and social functioning. Disease activity indexes have serious limitations in measuring the social and emotional impairment caused by these chronic illnesses. Additionally, in most studies10,13–18 the reported correlation of disease activity indexes with the bowel symptoms dimension of the IBDQ was higher than with the social and emotional dimensions. The use of “health care use” as a “proxy” measure for HRQoL by de Boer et al12 can also be debated. In a similar context, Drossman et al35 used similar measures; they reported correlations between hospitalizations and quality of life in patients with IBD. It can be argued, however, that health care use is possibly determined more by other factors

© 2004 Lippincott Williams & Wilkins

Inflamm Bowel Dis • Volume 10, Number 3, May 2004

TABLE 2. “Proxy” Measures Used for the Assessment of Construct Validity of the Adapted Questionnaires

Studies (rows): De Boer et al*, Russel et al, Han et al, Kim et al*, Lopez-Vivancos et al, Cheung et al, Hjortswang et al, Pallis et al, Leong et al

Generic Instruments: MOS-24, CES-D, MOS social support survey, SF-36, PGWBI, EuroQoL, RFIPC, VAS

Disease Activity Indexes: HBI, UCI, CAI, CDAI, St. Mark’s Activity Index, Rachmilewitz Index, empirical index

Visual Analog Scale: general well-being, bowel condition, bowel function, emotional health, social health

Other: health care use

MOS, Medical Outcome Studies; CES-D, Center for Epidemiologic Studies Depression scale; SF-36, Short Form 36 questionnaire; HBI, Harvey-Bradshaw Index; UCI, Ulcerative Colitis Index; CAI, Colitis Activity Index; CDAI, Crohn’s Disease Activity Index; PGWBI, Psychological General Well-Being Index; EuroQol, Euro Quality of Life Questionnaire; RFIPC, Rating Form of IBD Patient Concerns; VAS, Visual Analog Scale. *Only one type of “proxy” measure used.
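The correlation analyses used to assess construct validity against the “proxy” measures in Table 2 can be illustrated with a minimal sketch. It assumes Spearman’s rank correlation as the statistic (the individual studies may have used other coefficients), and both the function names and the patient scores are hypothetical; none of the numbers come from the reviewed studies.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long samples with at least 2 values")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one sample is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical data for eight patients: IBDQ bowel dimension scores
# (higher = better HRQoL) and a disease activity index (higher = more active).
ibdq_bowel = [62, 55, 48, 66, 40, 35, 58, 44]
activity_index = [4, 2, 7, 1, 9, 12, 3, 8]
print(f"rho = {spearman(ibdq_bowel, activity_index):.2f}")
```

A negative coefficient is the expected direction here, because higher disease activity should go with lower IBDQ scores; for a generic instrument scored in the same direction as the IBDQ, a positive coefficient would be expected instead.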

such as health policy, social and economic status, cultural context, or lack of geographical proximity.

The ability of the adapted versions of the IBDQ to reflect clinically important differences was demonstrated in seven studies11–13,15–18 by the significant differences in scores between patients with disease in remission and patients with disease in relapse. Although significant differences were expected for the bowel and systemic symptoms dimensions, patients were also found to differ significantly on the emotional and social dimensions. The only exceptions were the UK-IBDQ study,11 where no difference was found for the systemic dimension, and the Chinese version of the IBDQ, where only a trend toward significance was detected for the bowel symptoms dimension in UC patients.18 The only study to assess the relative importance of the four dimensions in distinguishing between two groups with important clinical differences was the Greek one.17 A multivariate comparison by discriminant analysis showed that emotional and social function had no significant distinguishing power once the other two dimensions, systemic and bowel symptoms, were taken into account. This is an important observation, leading to the conclusion that at least

in the Greek IBD patients the bowel and systemic symptoms dimensions are those predominantly involved in the difference in HRQoL between patients in the two groups. The emotional and social dimensions are also involved, but to a smaller, less significant extent.

Reliability has two main aspects: test–retest reliability and internal validity (Table 3). All studies assessed test–retest reliability, with the exception of the Korean study,14 which did not report any results of reliability measurement. In six studies,11,13,15–18 test–retest reliability was tested by comparing all four IBDQ dimensional scores between consecutive administrations of the questionnaire in patients who reported no change in their general well-being or had no change in disease activity. All correlation coefficients (ICC or Spearman correlation coefficient) between scores in the two administrations were very high, with the sole exception of the social dimensional score in the Swedish study.16 According to the authors, a possible reason for this low correlation was the long interval (6 months) between the two administrations. In the two remaining studies,10,12 reliability was tested by comparing IBDQ scores between the first and second administration, but information on


TABLE 3. Test-Retest Reliability of the IBDQ Expressed by the Correlation of the IBDQ Dimensional Scores at Two Consecutive Assessments

Study                     n     Bowel    Systemic   Emotional   Social
De Boer et al                   NR       NR         NR          NR
Russel et al
  CD patients             48    0.93     0.75       0.78        0.88
  UC patients             27    0.87     0.91       0.78        0.91
Han et al*                30    0.79     0.71       0.79        0.88
Kim et al                 —     NR       NR         NR          NR
Cheung et al**            45    0.90     0.90       0.87        0.96
Lopez-Vivancos et al
  CD patients             58    0.85     0.81       0.77        0.67
  UC patients             72    0.70     0.81       0.79        0.80
Hjortswang et al          75    0.71     0.75       0.40        0.76
Pallis et al
  CD patients             14    0.988    0.933      0.998       0.993
  UC patients             14    0.940    0.981      0.998       0.986
Leong et al
  CD patients             38    0.876    0.832      0.906       0.916
  UC patients             38    0.757    0.878      0.871       0.901

Statistical Method: ICC for six of the seven studies reporting coefficients, Spearman’s R for one.

UC, ulcerative colitis; CD, Crohn’s disease; ICC, Intraclass Correlation Coefficient; NR, not reported. *ICC between all three administrations. **Modified IBDQ with two bowel dimensions.
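As an illustration of the kind of coefficient reported in Table 3, the following minimal sketch computes an intraclass correlation coefficient for repeated administrations of one IBDQ dimension. The reviewed studies do not state which ICC form they used, so the choice here of the two-way random-effects, absolute-agreement, single-measurement form, often written ICC(2,1), is an assumption; the function name `icc_2_1` and the scores are hypothetical.

```python
def icc_2_1(scores):
    """ICC(2,1) from a table with one row per patient and one column per
    administration, using the two-way ANOVA mean squares."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least 2 patients")
    k = len(scores[0])
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("every patient needs the same number (>= 2) of administrations")

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between patients
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between administrations
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_error = ss_total - ss_rows - ss_cols                  # residual

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    denominator = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    if denominator == 0:
        raise ValueError("ICC is undefined when all scores are identical")
    return (ms_rows - ms_error) / denominator


# Hypothetical bowel dimension scores of five stable patients at two administrations.
stable_patients = [[52, 54], [61, 60], [45, 47], [66, 65], [58, 57]]
print(f"ICC(2,1) = {icc_2_1(stable_patients):.3f}")
```

Because this form measures absolute agreement, a systematic shift between administrations lowers the coefficient even when patients keep the same rank order. A consistency-type ICC or a rank correlation would not penalize such a shift, which is one reason coefficients computed with different methods are not directly comparable.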

changes in patients’ disease activity status between the administrations is not provided.

Internal validity was assessed in four studies. The first Dutch12 and the first English10 studies showed a good degree of internal validity. Principal component analysis of the IBDQ items was carried out in both the Swedish16 and the second English11 study and did not support the four-dimensional structure. The Swedish study yielded six dimensions, whereas the English study yielded five; both studies found the original “bowel function” dimension unstable, with its items clustering in two parts (not the same parts in the two studies).

Sensitivity to change was assessed by comparing the four IBDQ dimensional scores among patients who reported a change between the first and the second administration of the questionnaire. All six studies dealing with this concept11,13,15–18 showed that the IBDQ is a sensitive instrument for quantifying changes in HRQoL relative to changes in clinical activity in UC and CD. All dimensional scores differed significantly between the baseline and the follow-up measurement. The only exception was the Dutch study by Russel et al,13 in which all four IBDQ dimensions differed significantly between the two administrations in patients with CD, whereas only the bowel symptoms dimension differed in patients with UC.

Assessing HRQoL in an objective and reliable way is considered a prerequisite of well-designed IBD trials and will


be increasingly important in the future. The IBDQ has been shown to have adequate validity and reliability, even in countries with different languages, cultures, and ways of life. It shows significant correlation with disease activity indexes and generic instruments. Thus, this instrument is recommended for use in healthcare research or evaluations to assess the HRQoL in IBD patients.
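The sensitivity-to-change comparisons discussed above, in which dimensional scores at baseline are compared with scores at follow-up in patients who reported a change, can be sketched as a paired analysis. The review does not specify which significance test each study applied, so the paired t statistic below is an assumption chosen for illustration; the function name `paired_change` and the scores are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_change(baseline, follow_up):
    """Summarize within-patient change between two administrations.

    Returns the mean change, the paired t statistic, and its degrees of
    freedom. A p-value would require a t distribution (for example from a
    statistics package) and is deliberately left out of this sketch.
    """
    if len(baseline) != len(follow_up) or len(baseline) < 2:
        raise ValueError("need paired scores for at least 2 patients")
    diffs = [after - before for before, after in zip(baseline, follow_up)]
    spread = stdev(diffs)  # sample standard deviation of the changes
    if spread == 0:
        raise ValueError("t is undefined when every patient changes by the same amount")
    n = len(diffs)
    mean_change = mean(diffs)
    return {
        "mean_change": mean_change,
        "t": mean_change / (spread / sqrt(n)),
        "df": n - 1,
    }


# Hypothetical emotional dimension scores of six patients who reported
# improvement between the first and second administration.
baseline = [48.0, 52.0, 45.0, 60.0, 55.0, 50.0]
follow_up = [58.0, 59.0, 57.0, 66.0, 63.0, 61.0]
summary = paired_change(baseline, follow_up)
print(f"mean change = {summary['mean_change']:.1f}, t = {summary['t']:.2f}, df = {summary['df']}")
```

Running the same summary separately for each of the four dimensions, and separately for CD and UC patients, would mirror the layout of the comparisons described in the reviewed studies.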

DEDICATION

Ioannis Vlachonikolis, Professor of Biostatistics at the Medical Faculty of the University of Crete, Greece, was a biostatistician dedicated to the advancement of the study of cancer and chronic disease. For nearly 30 years he pioneered methodological research in statistical and mathematical applications for collaborative multidisciplinary projects in medicine, epidemiology, biology, and psychiatry. Ioannis had a brilliant mind; he was a provocative thinker, but above all he was a good friend and generous colleague. After suffering a devastating neurological disease, Ioannis Vlachonikolis died on September 20, 2003, aged 51. He leaves a wife and son. The memory of his work, and of his persona, will long remain an inspiration for his friends.

REFERENCES

1. Guyatt G, Mitchell A, Irvine EJ, et al. A new measure of health status for clinical trials in inflammatory bowel disease. Gastroenterology. 1989;96:804–810.


2. Mitchell A, Guyatt G, Singer J, et al. Quality of life in patients with inflammatory bowel disease. J Clin Gastroenterol. 1988;10:306–310.
3. Irvine EJ. Quality of life - measurement in inflammatory bowel disease. Scand J Gastroenterol Suppl. 1993;199:36–39.
4. Irvine EJ, Feagan BG, Wong CJ. Does self-administration of a quality of life index for inflammatory bowel disease change the results? J Clin Epidemiol. 1996;49:1177–1185.
5. Irvine EJ, Zhou Q, Thompson AK. The Short Inflammatory Bowel Disease Questionnaire: a quality of life instrument for community physicians managing inflammatory bowel disease. CCRPT Investigators. Canadian Crohn’s Relapse Prevention Trial. Am J Gastroenterol. 1996;91:1571–1578.
6. Guyatt GH, Feeny DH, Patrick DL. Measuring health-related quality of life. Ann Intern Med. 1993;118:622–629.
7. Guillemin F, Bombardier C, Beaton D. Cross-cultural adaptation of health-related quality of life measures: literature review and proposed guidelines. J Clin Epidemiol. 1993;46:1417–1432.
8. Assessing health status and quality-of-life instruments: attributes and review criteria. Qual Life Res. 2002;11(3):193–205.
9. MAPI Research Institute. Cultural adaptation of quality of life instruments. Qual Life Newslett. 1997;18:10–11.
10. Han SW, McColl E, Steen N, et al. The inflammatory bowel disease questionnaire: a valid and reliable measure in ulcerative colitis patients in the North East of England. Scand J Gastroenterol. 1998;33:961–966.
11. Cheung WY, Garratt AM, Russell IT, et al. The UK IBDQ - a British version of the inflammatory bowel disease questionnaire. Development and validation. J Clin Epidemiol. 2000;53:297–306.
12. de Boer AG, Wijker W, Bartelsman JF, et al. Inflammatory Bowel Disease Questionnaire: cross-cultural adaptation and further validation. Eur J Gastroenterol Hepatol. 1995;7:1043–1050.
13. Russel MG, Pastoor CJ, Brandon S, et al. Validation of the Dutch translation of the Inflammatory Bowel Disease Questionnaire (IBDQ): a health-related quality of life questionnaire in inflammatory bowel disease. Digestion. 1997;58:282–288.
14. Kim WH, Cho YS, Yoo HM, et al. Quality of life in Korean patients with inflammatory bowel diseases: ulcerative colitis, Crohn’s disease and intestinal Behcet’s disease. Int J Colorectal Dis. 1999;14:52–57.
15. Lopez-Vivancos J, Casellas F, Badia X, et al. Validation of the Spanish version of the inflammatory bowel disease questionnaire on ulcerative colitis and Crohn’s disease. Digestion. 1999;60:274–280.
16. Hjortswang H, Jarnerot G, Curman B, et al. Validation of the inflammatory bowel disease questionnaire in Swedish patients with ulcerative colitis. Scand J Gastroenterol. 2001;36:77–85.
17. Pallis AG, Vlachonikolis IG, Mouzas IA. Quality of life of Greek patients with inflammatory bowel disease. Validation of the Greek translation of the inflammatory bowel disease questionnaire. Digestion. 2001;63:240–246.
18. Leong RW, Lee YT, Ching JY, et al. Quality of life in Chinese patients with inflammatory bowel disease: validation of the Chinese translation of the Inflammatory Bowel Disease Questionnaire. Aliment Pharmacol Ther. 2003;17:711–718.
19. Love JR, Irvine EJ, Fedorak RN. Quality of life in inflammatory bowel disease. J Clin Gastroenterol. 1992;14:15–19.


20. Irvine EJ, Feagan B, Rochon J, et al. Quality of life: a valid and reliable measure of therapeutic efficacy in the treatment of inflammatory bowel disease. Canadian Crohn’s Relapse Prevention Trial Study Group. Gastroenterology. 1994;106:287–296.
21. Stewart AL, Greenfield S, Hays RD, et al. Functional status and well-being of patients with chronic conditions. Results from the Medical Outcomes Study. JAMA. 1989;262:907–913.
22. Stewart AL, Hays RD, Ware JE Jr. The MOS short-form general health survey. Reliability and validity in a patient population. Med Care. 1988;26:724–735.
23. Radloff LS. The CES-D scale: a self-report depression scale for research in the general population. Appl Psychol Meas. 1977;1:385–401.
24. Sherbourne CD, Stewart AL. The MOS social support survey. Soc Sci Med. 1991;32:705–714.
25. Sherbourne CD, Meredith LS, Rogers W, et al. Social support and stressful life events: age differences in their effects on health-related quality of life among the chronically ill. Qual Life Res. 1992;1:235–246.
26. Ware JE Jr, Sherbourne CD. The MOS 36-item short-form health survey (SF-36). I. Conceptual framework and item selection. Med Care. 1992;30:473–483.
27. Harvey RF, Bradshaw JM. A simple index of Crohn’s-disease activity. Lancet. 1980;1:514.
28. Levenstein S, Prantera C, Varvo V, et al. Psychological stress and disease activity in ulcerative colitis: a multidimensional cross-sectional study. Am J Gastroenterol. 1994;89:1219–1225.
29. Lichtiger S, Present DH, Kornbluth A, et al. Cyclosporine in severe ulcerative colitis refractory to steroid therapy. N Engl J Med. 1994;330:1841–1845.
30. Powell-Tuck J, Day DW, Buckell NA, et al. Correlations between defined sigmoidoscopic appearances and other measures of disease activity in ulcerative colitis. Dig Dis Sci. 1982;27:533–537.
31. Best WR, Becktel JM, Singleton JW, et al. Development of a Crohn’s disease activity index. National Cooperative Crohn’s Disease Study. Gastroenterology. 1976;70:439–444.
32. Dupuy HJ. The Psychological General Well-Being (PGWB) Index. In: Wenger NK, Mattson ME, Furberg CD, eds. Assessment of Quality of Life in Clinical Trials of Cardiovascular Therapies. New York: Le Jacq; 1984.
33. The EuroQol Group. EuroQol - a new facility for the measurement of health-related quality of life. Health Policy. 1990;16:199–208.
34. Rachmilewitz D. Coated mesalazine (5-aminosalicylic acid) versus sulphasalazine in the treatment of active ulcerative colitis: a randomised trial. BMJ. 1989;298:82–86.
35. Drossman DA, Leserman J, Li ZM, et al. The rating form of IBD patient concerns: a new measure of health status. Psychosom Med. 1991;53:701–712.
36. Baron JH, Connell AM, Lennard-Jones JE. Variation between observers in describing mucosal appearances in proctocolitis. BMJ. 1964;1:89–96.
37. Ginsberg AL, Beck LS, McIntosh TM, et al. Treatment of left-sided ulcerative colitis with 4-aminosalicylic acid enemas. A double-blind, placebo-controlled trial. Ann Intern Med. 1988;108:195–199.
38. Mardia KV, Kent JT, Bibby JM. Multivariate Analysis. London: Academic Press; 1982.
