Module 4 Confidence Intervals

Author: Moses Cole

Objective: At the completion of this module you will know how to take into account the sampling variability, or uncertainty, in sample estimates of population parameters (e.g., the population mean and proportion), and how to compare two groups of patients.

4.1 Introduction

In Section 3.3 of Module 3 we discussed that the mean, proportion, relative risk and odds ratio are unknown for a population. These unknown quantities in the population are known as parameters, and they are estimated from the sample data. Two methods that are commonly used for estimating parameters are: (a) Point Estimation and (b) Interval Estimation.

Point estimation involves calculating a single number as an estimate of the parameter of interest. For example, let us assume that we are interested in the mean body mass index (BMI) of cardiac surgery patients in Victoria. Calculating the true mean BMI is difficult, but it can be estimated from sample data. Consider a random sample of 30 patients from the cardiac surgery population and calculate the sample mean BMI (26.86 kg/m², see Table 4.1). This sample mean is a point estimate of the true mean BMI of the cardiac surgery patients in Victoria. A point estimate, however, does not provide any information about the inherent variability of the estimator; we do not know how close the sample estimate is to the true parameter.

Table 4.1: Body mass index (kg/m²) in a sample of 30 patients

26.0  25.3  26.6  26.1  28.5  24.8  30.1  26.5  31.1  19.7
21.6  27.5  29.8  29.7  25.9  30.6  26.8  25.7  27.5  20.8
27.0  31.0  30.2  25.2  27.5  25.1  31.2  28.0  22.8  27.2

A sample mean is rarely the same as the true mean. A difference between the sample mean and the true mean may occur purely by chance, that is, through sampling variation. So it is sensible to estimate the true mean by an interval centred on the sample mean, called a Confidence Interval (see the following figure). Confidence intervals take into account the sampling variability or uncertainty in the sample estimates by incorporating the standard error in their calculation.

Consider the following graph. Let us assume that the oval on the horizontal line is the sample mean BMI (the estimate of the true mean BMI) and that the vertical lines at the ends of the horizontal line are the two limits, known as confidence limits, of the confidence interval for the true mean BMI. It is expected that the true mean BMI will be within these limits. The confidence interval has an associated percentage, for example 95%, to show how confident we are that this interval contains the true mean. Since we put some confidence in this interval procedure, the interval is called a Confidence Interval.

[Figure: a horizontal line with the sample mean BMI (26.86 kg/m²) marked at its centre and the confidence limits marked at either end.]

Confidence intervals can be calculated for various parameters of interest such as the mean, proportion, relative risk, odds ratio, etc. In this module we discuss confidence intervals for the true mean and the true proportion; confidence intervals for the relative risk and odds ratio are discussed in Module 7. Note that the confidence interval for the true mean (or simply mean) is appropriate for continuous data and the confidence interval for the true proportion (or simply proportion) is appropriate for categorical data. Calculations of confidence intervals for the true mean always assume that the sampled population is normally distributed.

In this module, we discuss the following topics:
• Confidence intervals for a single true mean
• Comparing two groups: confidence intervals for the difference between the means of two independent populations
• Confidence intervals for a single true proportion
• Comparing two groups: confidence intervals for the difference between the true proportions of two independent populations
• Comparing two groups: confidence intervals for the difference between two true means, paired data (paired samples)

Notations in this module are:
• Sample size: $n$
• True/population mean: $\mu$
• Sample mean: $\bar{x}$
• True/population standard deviation: $\sigma$
• Sample standard deviation: $s$
• Standard error: SE
• True/population proportion: $\pi$
• Number of patients in a sample of $n$ patients with a special characteristic: $r$
• Sample proportion: $p = r / n$
• The normal distribution multiplier: Z
• The t-distribution multiplier: T

4.2 Confidence Intervals for a Single True Mean

In medical research we are sometimes interested in calculating a confidence interval for a single true mean. For example, assume that we are interested in the confidence interval for the true mean BMI of cardiac surgery patients in Victoria. As discussed in Module 3, for a large sample, the sampling distribution of the sample mean (the distribution of the mean in repeated samples) follows the normal distribution regardless of the shape of the sampled population (see Figure 3.5, Module 3). Thus, according to the normal distribution probability law discussed in Module 3: (a) 68% of the sample means will be within one SE of the true mean, (b) 95% of the sample means will be within 1.96 (or approximately 2) SE of the true mean and (c) 99% of the sample means will be within 2.58 SE of the true mean (3 SE, sometimes quoted as a rough figure, in fact covers about 99.7%).

If a sample mean is within 2 SE of the true mean, the true mean is also within 2 SE of the sample mean. This means that if we draw many samples and calculate the interval "Sample Mean ± 2.0 × SE of Mean" for each sample, about 95% of these intervals will include the true mean and about 5% will not. In notation this interval can be written as $\bar{x} \pm 2.0 \times SE$, where $\bar{x}$ is the sample mean and SE is the standard error of the mean. This interval is called the 95% confidence interval for the true mean (or simply for the mean) because in 95% of samples the true mean will be covered by this random interval.

More specifically, if we take, for example, 100 random samples of the same size, each sample may yield a different 95% confidence interval. Among these 100 intervals, we expect approximately 95 to cover the true mean and 5 not to cover it. We never know whether the confidence interval calculated from a particular sample is a good one (it actually contains the true mean) or a bad one (it does not). All we know is that in the long run, 95% of the confidence intervals calculated are good because they include the true mean. Let us draw 20 samples each of size 30 from the cardiac surgery population and for each sample calculate the sample mean BMI and the SE of the mean. Then we calculate $\bar{x} \pm 2.0 \times SE$, the 95% confidence interval, for each sample; these intervals are presented in Figure 4.1.
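The long-run interpretation above can be checked with a small simulation. The following Python sketch is illustrative only: the module itself works in Excel, and the population values used here (normal, mean 27.0, standard deviation 3.0) are assumptions chosen for the example, not estimates from the cardiac surgery data. It draws many samples of size 30 and counts how often the interval $\bar{x} \pm 2.0 \times SE$ covers the assumed true mean.

```python
import random
import statistics

def coverage(n_samples=2000, n=30, true_mean=27.0, true_sd=3.0,
             multiplier=2.0, seed=1):
    """Fraction of intervals x_bar ± multiplier × SE that contain true_mean.

    true_mean and true_sd describe a hypothetical normal population.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        x_bar = statistics.mean(sample)
        se = statistics.stdev(sample) / n ** 0.5  # SE = s / sqrt(n)
        if x_bar - multiplier * se <= true_mean <= x_bar + multiplier * se:
            hits += 1
    return hits / n_samples

print(f"Proportion of intervals covering the true mean: {coverage():.3f}")
```

The printed proportion should be close to 0.95, although any single run will differ slightly because the samples are random.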

Figure 4.1: Confidence intervals for μ (true mean) from repeated samples. Sample mean and 95% CI for 20 samples each of size 30. [Plot: x-axis, random sample number (0 to 20); y-axis, sample BMI mean (26 to 34); legend: sample mean, mean coverage.]

A description of Figure 4.1 is as follows:
• The horizontal line is the unknown true mean BMI for cardiac surgery patients in Victoria.
• The thick dot in the middle of each vertical line shows the sample mean for that sample.
• The ends of each small vertical line are the lower and upper limits calculated using the formula $\bar{x} \pm 2.0 \times SE$.
• All the intervals except the first interval from the left include the true mean.

A 95% confidence interval is common in health science research. However, one may be interested in calculating confidence intervals for other confidence levels, e.g., 90%, 93%, 98%, etc. Further, if the sample size is not sufficiently large, the normal-distribution reference ranges (±1 × SE, ±2 × SE and ±2.58 × SE) do not hold. (How large is large? There is no specific answer to this question, although some textbooks consider a sample of size greater than 30 to be large.) This means that for small samples the interval $\bar{x} \pm 2.0 \times SE$ may not cover the true mean in 95% of the samples. Similarly, the interval $\bar{x} \pm 2.58 \times SE$ may fail to include the true mean in 99% of the samples. Hence, we require a general formula for constructing confidence intervals for a single true mean, which is as follows.

Sample Mean ± Multiplier × SE of Mean

Two multipliers, namely the normal distribution multiplier denoted by Z and the t-distribution multiplier denoted by T, are widely used in the construction of confidence intervals. The value of a multiplier relates to the amount of confidence (e.g., 95%) used for obtaining a confidence interval.

How do we choose between the Z and T multipliers?

Usually, if the sample size is small we use the t-distribution multiplier, and we use the normal distribution multiplier when the sample size is large. However, the t-distribution and normal distribution multipliers are nearly the same for large samples; therefore, the t-distribution multiplier can be used for both small and large samples. So throughout this module we use the t-distribution multiplier T for constructing confidence intervals for the population mean. Thus, the formula for the confidence interval for the true mean is given by

Sample Mean ± T × SE of Mean

Here $SE = s / \sqrt{n}$, where $s$ is the sample standard deviation and $n$ is the sample size. Thus, the final formula for the confidence interval for a single true mean is as follows:

$$\bar{x} \pm T \times \frac{s}{\sqrt{n}}$$

Confidence intervals for a true mean can be calculated under either of the following two assumptions: (a) the true standard deviation is known and (b) the true standard deviation is unknown. In practice, it is unlikely that the true mean is unknown while the true standard deviation is known. Therefore in this module we discuss confidence intervals under the assumption that the true standard deviation is unknown.

How to calculate multipliers?

The T multiplier value is obtained from the t-distribution table, Table 4.7, presented at the end of this module. For a small sample size, the T value in Table 4.7 depends on:
• The degrees of freedom (d.f.) of the t-distribution; for single-sample data the d.f. is obtained by subtracting one from the total number of observations; and
• The confidence level (e.g., 95%).

The first column in Table 4.7 shows the d.f. and the first row presents the confidence level. For example, let us assume that we want to calculate a 95% confidence interval for the true mean BMI, and that we draw a random sample of 30 patients from the population. Then the t-distribution multiplier value T can be obtained in the following steps.

• Step 1: Open the t-distribution table, Table 4.7, at the end of this module.
• Step 2: Go to the row with d.f. of 29 (d.f. = sample size minus one) in the first column.
• Step 3: Then go along to the column for the 95% confidence level.
• Step 4: The value at the intersection of d.f. of 29 and confidence level of 95% is the required T-value; here the T-value is 2.045.

Note: If the d.f. does not exactly match any of those presented in Table 4.7, then round the d.f. value to the nearest value in the table. For example, consider a d.f. of 37, which lies between d.f. of 30 and 40; we round this d.f. value to 40 (the nearest value) and hence T = 2.021. If the d.f. is larger than 120, we consider it as infinite (∞, see the last row of Table 4.7).
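The lookup-and-round rule in the note can be sketched in Python. This is an illustration under stated assumptions: the dictionary `T_95` holds only a few standard 95% t-table values (it is an excerpt, not a copy of Table 4.7, which lists many more rows), and the function name `t_multiplier_95` is hypothetical.

```python
# Excerpt of standard two-sided 95% t multipliers, keyed by degrees of freedom.
# Assumption: only a handful of rows are included for illustration.
T_95 = {8: 2.306, 29: 2.045, 30: 2.042, 40: 2.021, 60: 2.000, 120: 1.980}
Z_95 = 1.960  # the "infinite d.f." row, equal to the normal multiplier

def t_multiplier_95(df):
    """Return the 95% multiplier for the tabulated d.f. nearest to df."""
    if df > 120:
        return Z_95  # treat d.f. above 120 as infinite
    if df < min(T_95):
        raise ValueError("d.f. is below the smallest row in this excerpt")
    nearest = min(T_95, key=lambda row: abs(row - df))
    return T_95[nearest]

print(t_multiplier_95(29))   # exact row
print(t_multiplier_95(37))   # rounded to the 40 d.f. row
print(t_multiplier_95(500))  # treated as infinite
```

With a complete table the same rule applies unchanged; only the dictionary would grow.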

Assumptions for Confidence Intervals: We make the following assumptions when constructing confidence intervals for the true mean:
o The sample is drawn randomly.
o Observations within a sample are independent.
o The sampled population is normally distributed.

Steps for Construction of Confidence Intervals: We can calculate confidence intervals by following the steps below:

• Draw a random sample and then calculate the sample mean and the sample standard deviation (use Excel).
• Compute the SE of the mean.
• Calculate the d.f. and then find the T-value from the t-distribution table (Table 4.7 in the appendix).
• Put the sample mean, the multiplier T and the SE together in the following formula to give the interval of plausible values for the true mean:

$$\bar{x} \pm T \times \frac{s}{\sqrt{n}}$$

Consider the variable body mass index (BMI) for the cardiac surgery patients in Victoria. As discussed earlier, consider the BMI data in Table 4.1 and calculate the sample mean and the SE of the mean. We are interested in the 95% confidence interval for the true mean BMI, or simply the mean BMI, in the population.

Calculation of the 95% confidence interval (you can use Microsoft Excel for most of the calculations shown below):
• Sample size: $n = 30$
• Degrees of freedom: d.f. = $n - 1 = 29$
• Sample mean: $\bar{x} = 26.86$ kg/m²
• Sample standard deviation: $s = 2.995$ kg/m²
• Standard error of the mean: $SE = s / \sqrt{n} = 2.995 / \sqrt{30} = 0.547$ kg/m²
• Multiplier: T = 2.045 (see Table 4.7)
• Lower limit = Sample Mean − T × SE = 26.86 − 2.045 × 0.547 = 25.74
• Upper limit = Sample Mean + T × SE = 26.86 + 2.045 × 0.547 = 27.98
• Confidence interval is (25.74, 27.98) kg/m²

We are 95% confident that the mean BMI of cardiac surgery patients in Victoria is in the interval 25.74 to 27.98 kg/m². This indicates that in general the patients who had cardiac surgery are overweight (healthy: 18.5 ≤ BMI ≤ 25; overweight: 25 < BMI ≤ 30; obese: BMI > 30 kg/m²).
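The same calculation can be reproduced outside Excel. The Python sketch below is an illustrative alternative, not the module's prescribed method; it uses the 30 BMI values of Table 4.1 and takes the multiplier T = 2.045 from the t-table rather than computing it.

```python
import statistics

# BMI values (kg/m^2) from Table 4.1
bmi = [26.0, 25.3, 26.6, 26.1, 28.5, 24.8, 30.1, 26.5, 31.1, 19.7,
       21.6, 27.5, 29.8, 29.7, 25.9, 30.6, 26.8, 25.7, 27.5, 20.8,
       27.0, 31.0, 30.2, 25.2, 27.5, 25.1, 31.2, 28.0, 22.8, 27.2]

n = len(bmi)
x_bar = statistics.mean(bmi)   # sample mean
s = statistics.stdev(bmi)      # sample standard deviation (n - 1 divisor)
se = s / n ** 0.5              # standard error of the mean
T = 2.045                      # 95% multiplier for 29 d.f., read from the t-table

lower = x_bar - T * se
upper = x_bar + T * se
print(f"n = {n}, mean = {x_bar:.2f}, s = {s:.3f}, SE = {se:.3f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f}) kg/m^2")
```

`statistics.stdev` uses the n − 1 divisor, which matches the sample standard deviation used in this module.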


Width of a Confidence Interval: The width of a confidence interval is the difference between the upper limit and the lower limit of the interval (in the above example the width is 27.98 − 25.74 = 2.24 kg/m²). At a given confidence level, a narrower interval is a more precise one. In practice, we prefer a higher confidence level and a narrower interval; a narrower width indicates smaller sampling variability or uncertainty in the sample estimate of the population/true mean. If the confidence level increases, the width increases, and vice versa. However, the width of a confidence interval can be reduced without compromising the confidence level by increasing the sample size. If the sample size increases, the standard error decreases (uncertainty decreases); this results in a narrower confidence interval.
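The effect of sample size on width can be made concrete with a short Python sketch. It rests on simplifying assumptions: the sample standard deviation is held at 2.995 kg/m² and the multiplier is held at 2.045 for every sample size, whereas in practice both would change a little from sample to sample. The function name `ci_width` is hypothetical.

```python
def ci_width(n, s=2.995, multiplier=2.045):
    """Width of the interval x_bar ± multiplier × s / sqrt(n)."""
    return 2 * multiplier * s / n ** 0.5

for n in (30, 60, 120, 480):
    print(f"n = {n:3d}: width = {ci_width(n):.2f} kg/m^2")
```

Because the width is proportional to $1/\sqrt{n}$, quadrupling the sample size halves the width under these assumptions.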

4.3 Comparing Two Groups

Comparison between two groups of patients is very common in medical research as well as in other scientific research. For example, a clinician may be interested in comparing a new drug A with an old drug B; a baby food producer may wish to find out whether their product increases baby weights faster than a product made by another company; a research nurse may wish to compare the serum iron level of two groups of children; a cardiologist may be interested in comparing the preoperative creatinine level of patients by their diabetic status; a researcher may compare the BMI of male and female cardiac surgery patients, etc.

Two groups of patients can be compared by comparing specific parameters of interest from each group (e.g., mean, proportion). The two methods that are commonly used for comparing parameters of two groups are "confidence intervals" and "hypothesis testing". In this module, we discuss the method of confidence intervals. The hypothesis testing technique will be discussed in Module 5.

In this section, we compare two groups by constructing a confidence interval for the difference in two true means. The construction of a confidence interval for the difference between two true means requires knowledge of the sampling distribution of the difference in two sample means. As discussed in Module 3, for large samples, the sampling distribution of the difference in two sample means follows the normal distribution.

4.3.1 CI for the difference in two true means

Since the sampling distribution of the difference between two sample means follows a normal distribution, 95% of the differences in sample means fall within 2 SE of the population mean difference. Then 95% of the time the difference between population means will also be within 2 SE of the difference in sample means. Similarly, in 99% of cases the true mean difference will be within about 2.58 SE of the difference in sample means. Thus the general formula for calculating confidence intervals for the difference between two true means is given by:

Difference in Two Sample Means ± Multiplier × SE for the Difference Between Two Sample Means

The formula for the SE will be discussed later; the manual calculation of the SE is a bit complex but Microsoft Excel can help us to calculate it. As discussed in Section 4.2, we use the t-distribution multiplier T for large as well as small samples. Thus the final formula for confidence intervals for the difference in two true means is given by:

Diff ± T × SE

We encounter mainly two cases when calculating confidence intervals for the difference between two true means: (a) Case 1: true standard deviations are unknown, but assumed equal and (b) Case 2: true standard deviations are unknown, but assumed unequal. Many textbooks also consider the case where the population standard deviations are known; however, in medical research we avoid making assumptions that are unlikely to be met in real life. In fact, it is unrealistic that the true means are unknown while the true standard deviations are known.

How do we choose between Case 1 and Case 2 in real life?
• We calculate the standard deviation of each sample; if the standard deviations are close to each other we use Case 1, otherwise we use Case 2. There is no rule of thumb for how close is close enough; just use your judgement.
• Alternatively, statistical theory can be used to assess the significance of the difference between two standard deviations; however, this topic is beyond the scope of this subject.

Assumptions:
o Sampled populations are independent and normally distributed.
o Samples are drawn randomly.
o Observations within a sample are independent.

Steps for the calculation of confidence intervals:
• Calculate the sample means and standard deviations (use Excel).
• Calculate the difference in sample means.
• Calculate the SE for the difference in two sample means (the formula depends on the choice between Case 1 and Case 2).
• Calculate the d.f. (the formula depends on the assumption about the equality of standard deviations).
• Obtain the t-distribution multiplier, T (see Table 4.7).
• Finally, put the values of the sample mean difference, SE and T into the formula for confidence intervals; this gives the confidence interval for the difference in two true means.

Case 1: True SDs are Unknown but Assumed Equal

Consider a study that was conducted to investigate the risk factors for heart disease among male and female patients in Victoria. One of the characteristics examined was body mass index (BMI), a measure of the extent to which an individual is overweight. We wish to determine whether the mean BMI of the male patients is equal to that of the female patients in the population. Let us assume that we have a random sample of size 25 from the male cardiac surgery patients and another random sample of size 19 from the female patients. The BMI values of the male and female patients in the samples are presented in Table 4.2.

Table 4.2: Body mass indices (kg/m²) in samples of 25 male and 19 female patients

BMI of Male Patients (n = 25)
27.8  22.0  30.5  34.6  29.6  24.5  24.8  30.5  34.3  30.0
25.8  29.4  29.7  31.0  27.5  21.9  29.4  36.5  27.0  29.0
44.5  22.7  32.8  24.8  29.6

BMI of Female Patients (n = 19)
25.4  24.4  29.4  27.9  29.3  31.6  32.7  25.4  23.3  34.9
28.6  21.9  30.1  23.2  19.1  20.1  35.1  28.9  22.5

Using Microsoft Excel, the sample mean and standard deviation of BMI for male patients are respectively 29.21 kg/m² and 4.97 kg/m². Similarly, for female patients the sample mean and standard deviation are respectively 27.04 kg/m² and 4.75 kg/m². The sample standard deviations are close enough to assume that the population standard deviations are equal. The formula for the SE is as follows:

$$SE = \sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}, \quad \text{where} \quad s_p^2 = \frac{(n_1 - 1) \times s_1^2 + (n_2 - 1) \times s_2^2}{n_1 + n_2 - 2}$$

The d.f. is calculated using the formula: d.f. = $(n_1 - 1) + (n_2 - 1) = n_1 + n_2 - 2$. Here $n_1$ and $n_2$ are the sample sizes and $s_1$ and $s_2$ are the standard deviations for male and female patients respectively. The pooled variance $s_p^2$ can be calculated using Microsoft Excel (see Excel Help posted on MUSO WebCT).

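Calculations:
• Sample mean difference (Male − Female) = 29.21 − 27.04 = 2.17 kg/m²
• Degrees of freedom: d.f. = 25 + 19 − 2 = 42
• Multiplier: T = 2.021 (Table 4.7, using the row for 40 d.f., the nearest tabulated value)
• Pooled variance: $s_p^2 = 23.81$
• Standard error: $SE = \sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}} = \sqrt{\dfrac{23.81}{25} + \dfrac{23.81}{19}} = 1.49$
• Lower limit = Diff − T × SE = 2.17 − 2.021 × 1.49 = −0.833
• Upper limit = Diff + T × SE = 2.17 + 2.021 × 1.49 = 5.165
• Hence, the 95% confidence interval is (−0.833, 5.165) kg/m²

The quoted limits appear to be based on unrounded intermediate values; recomputing from the rounded figures shown gives approximately −0.84 and 5.18, which does not change the conclusion.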

On the basis of the sample data, we are 95% confident that the difference between the true mean BMI of male and female patients lies between −0.833 kg/m² and 5.165 kg/m². Since this interval includes zero, we conclude that the difference between the mean BMI of male and female patients in the cardiac surgery population in Victoria is not statistically significant. This means the sample data do not support a difference. Note that if the confidence interval for the difference in the means of two populations includes zero, the difference is not statistically significant; otherwise the difference is significant. A significant difference means that we have enough evidence from the sample data to say that there is a difference between the two population means; otherwise we do not have enough evidence to conclude that there is a difference.
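The Case 1 calculation can be sketched in Python as an alternative to Excel. The code uses the Table 4.2 data and the pooled-variance formula given above; the multiplier T = 2.021 is taken from the t-table rather than computed. Variable names are illustrative.

```python
import statistics

# BMI values (kg/m^2) from Table 4.2
male = [27.8, 22.0, 30.5, 34.6, 29.6, 24.5, 24.8, 30.5, 34.3, 30.0,
        25.8, 29.4, 29.7, 31.0, 27.5, 21.9, 29.4, 36.5, 27.0, 29.0,
        44.5, 22.7, 32.8, 24.8, 29.6]
female = [25.4, 24.4, 29.4, 27.9, 29.3, 31.6, 32.7, 25.4, 23.3, 34.9,
          28.6, 21.9, 30.1, 23.2, 19.1, 20.1, 35.1, 28.9, 22.5]

n1, n2 = len(male), len(female)
diff = statistics.mean(male) - statistics.mean(female)

# Pooled variance: weighted average of the two sample variances
var1, var2 = statistics.variance(male), statistics.variance(female)
df = n1 + n2 - 2
sp2 = ((n1 - 1) * var1 + (n2 - 1) * var2) / df

se = (sp2 / n1 + sp2 / n2) ** 0.5
T = 2.021  # 95% multiplier read from the t-table (row for 40 d.f.)

lower, upper = diff - T * se, diff + T * se
print(f"diff = {diff:.3f}, d.f. = {df}, pooled variance = {sp2:.2f}, SE = {se:.3f}")
print(f"95% CI: ({lower:.3f}, {upper:.3f}) kg/m^2")
```

Because the code works from unrounded means and variances, its limits may differ in the second or third decimal place from a hand calculation that uses rounded intermediate values.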

Case 2: True SDs are Unknown but Assumed Unequal

We now turn to the situation where the true standard deviations are unknown but assumed unequal. This case is likely to be encountered in real-life medical research as well as in other data. Let us consider a study where the investigator is interested in comparing the preoperative creatinine level of cardiac surgery patients with and without diabetes. A random sample of 50 patients was selected from each group (population). The data are shown in Table 4.3. Using Microsoft Excel, the sample mean and standard deviation of the preoperative creatinine levels in Table 4.3 are 0.1226 and 0.1095 mmol/L for diabetics, and 0.0991 and 0.0346 mmol/L for non-diabetics. The standard error is calculated using the following formula:

$$SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

For this case, the formula for the d.f. is complex, so use Microsoft Excel to calculate it. The formula is as follows:

$$\text{d.f.} = \frac{\left( \dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2} \right)^2}{\dfrac{\left( s_1^2 / n_1 \right)^2}{n_1 - 1} + \dfrac{\left( s_2^2 / n_2 \right)^2}{n_2 - 1}}$$

Table 4.3: Preoperative creatinine level (mmol/L) for 50 diabetic and 50 non-diabetic cardiac surgery patients

The 100 recorded values are listed below in the order in which they appear; the split of the values between the Diabetes and Non-diabetes columns is not recoverable from this listing.

0.09   0.09   0.08   0.16   0.13   0.08   0.12   0.09   0.076  0.12
0.082  0.10   0.09   0.18   0.06   0.13   0.094  0.11   0.09   0.10
0.08   0.06   0.09   0.10   0.42   0.05   0.08   0.07   0.09   0.07
0.09   0.09   0.09   0.10   0.10   0.11   0.11   0.05   0.12   0.08
0.13   0.13   0.54   0.09   0.08   0.08   0.01   0.11   0.10   0.18
0.12   0.09   0.154  0.11   0.10   0.193  0.13   0.09   0.07   0.08
0.10   0.085  0.09   0.141  0.11   0.17   0.08   0.09   0.09   0.09
0.10   0.07   0.09   0.06   0.05   0.07   0.17   0.08   0.08   0.09
0.09   0.12   0.12   0.09   0.13   0.08   0.63   0.07   0.11   0.09
0.10   0.143  0.106  0.09   0.07   0.09   0.09   0.102  0.14   0.01

Calculations:
• Difference in means (Diabetes − Non-diabetes) = 0.1226 − 0.0991 = 0.0235 mmol/L
• Degrees of freedom: d.f. = 59 (using Excel)
• Standard error: $SE = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} = \sqrt{\dfrac{(0.1095)^2}{50} + \dfrac{(0.0346)^2}{50}} = 0.0162$
   o $s_1 = 0.1095$ and $s_2 = 0.0346$ mmol/L
• Multiplier: T = 2.0
• Lower limit = Diff − T × SE = 0.0235 − 2.0 × 0.0162 = −0.0089
• Upper limit = Diff + T × SE = 0.0235 + 2.0 × 0.0162 = 0.0560
• 95% confidence interval is (−0.0089, 0.0560) mmol/L

Thus, we are 95% confident that the difference in mean preoperative creatinine level between diabetic and non-diabetic cardiac surgery patients in Victoria lies between −0.0089 mmol/L and 0.0560 mmol/L. The difference in the population mean creatinine levels of diabetics and non-diabetics is not statistically significant because the interval includes zero. Thus we do not have sufficient evidence from the data to conclude that the mean preoperative creatinine levels of these two groups of patients differ in the population.
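The SE and d.f. formulas for Case 2 can be evaluated directly from the summary statistics. The Python sketch below is an illustrative alternative to the Excel calculation; the function name `unequal_sd_se_df` is hypothetical, and the multiplier T = 2.0 is taken from the t-table.

```python
def unequal_sd_se_df(s1, n1, s2, n2):
    """SE and d.f. for the difference in means when SDs are assumed unequal."""
    v1 = s1 ** 2 / n1  # s1^2 / n1
    v2 = s2 ** 2 / n2  # s2^2 / n2
    se = (v1 + v2) ** 0.5
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return se, df

# Summary statistics quoted in the text (mmol/L)
diff = 0.1226 - 0.0991
se, df = unequal_sd_se_df(0.1095, 50, 0.0346, 50)
T = 2.0  # 95% multiplier read from the t-table

lower, upper = diff - T * se, diff + T * se
print(f"SE = {se:.4f}, d.f. = {df:.1f}")
print(f"95% CI: ({lower:.4f}, {upper:.4f}) mmol/L")
```

The d.f. from this formula is generally not a whole number; it is rounded before the t-table is consulted.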


4.4 Confidence Intervals for the Population Proportion

According to the central limit theorem, for large samples 95% of the sample proportions in repeated sampling fall within 1.96 SE of the population proportion. Therefore, 95% of the time the population proportion will be within 1.96 SE of the sample proportion. Similarly, in 99% of the samples the population proportion will be within 2.58 SE of the sample proportion. Thus the formula for confidence intervals for the population proportion is given by:

Sample proportion ± Z × SE of sample proportion

Here we always use the Z multiplier because the sampling distribution of the sample proportion is approximately normal provided the sample size is large. The values of the Z multiplier for confidence levels of 90%, 95% and 99% are respectively 1.65, 1.96 and 2.58 (see Table 4.8 at the end of this module). The SE for the sample proportion is given by $\sqrt{p \times (1 - p) / n}$, where $n$ is the sample size and $p$ is the estimated population proportion, or simply the sample proportion.

Let us assume that we want to calculate the confidence interval for the proportion of cardiac surgery patients who are diabetic and have a preoperative creatinine level greater than 0.133 mmol/L in the population. Consider the data in Table 4.4. The calculation of the 95% confidence interval is as follows:
• Sample size: $n = 50$
• Number of patients in the sample with creatinine level > 0.133 mmol/L: $r = 8$
• Sample proportion: $p = 8/50 = 0.16$
• SE for proportion: $\sqrt{p \times (1 - p) / n} = \sqrt{0.16 \times (1 - 0.16) / 50} = 0.051846$
• Multiplier: Z = 1.96 (see Table 4.8 at the end of this module)
• Lower limit: $p - Z \times SE = 0.16 - 1.96 \times 0.051846 = 0.058$
• Upper limit: $p + Z \times SE = 0.16 + 1.96 \times 0.051846 = 0.262$
• Confidence interval is (0.058, 0.262)

We are 95% confident that the proportion of diabetic cardiac surgery patients in Victoria with a preoperative creatinine level greater than 0.133 mmol/L is between 0.058 and 0.262.
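The proportion calculation above is short enough to wrap in a small function. The following Python sketch is illustrative; the function name `proportion_ci` is hypothetical, and the large-sample normal approximation is assumed, as in the text.

```python
def proportion_ci(r, n, z=1.96):
    """Large-sample confidence interval for a proportion: p ± z × SE."""
    p = r / n
    se = (p * (1 - p) / n) ** 0.5
    return p - z * se, p + z * se

# 8 of 50 diabetic patients with creatinine level > 0.133 mmol/L
lower, upper = proportion_ci(8, 50)
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

Passing `z=1.65` or `z=2.58` gives the 90% or 99% interval in the same way.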

4.4.2 Comparing Two Groups: CI for the difference in two population proportions

Comparison between two groups of patients is also very common when the information from each patient is collected or recorded on a categorical scale. This type of comparison arises in both observational and experimental studies. For example, a researcher may be interested in comparing the mortality of cardiac surgery patients between two hospitals; a clinician may compare reported pain relief in two groups of cancer patients, where one group is taking a treatment and the other group is on placebo; a cardiac surgeon may compare the proportions of obesity between male and female cardiac surgery patients in the populations, etc.

The above comparisons can be done by constructing a confidence interval for the difference between two population proportions. Construction of this confidence interval requires knowledge of the sampling distribution of the difference between the sample proportions calculated from two groups of patients. It was discussed in Module 3 that for large samples the sampling distribution of the difference between two sample proportions approaches the normal distribution, and hence in 95% of cases the difference in two population proportions falls within 1.96 standard errors of the difference between sample proportions. Similarly, 99% of the time the population proportion difference falls within 2.58 standard errors of the sample proportion difference. Thus the general formula for the confidence interval for the difference between two population proportions is given by:

Difference in Sample Proportions ± Z × SE for the Difference Between Sample Proportions

The standard error for the difference in sample proportions is given by:

$$SE = \sqrt{\frac{p_1 (1 - p_1)}{n_1} + \frac{p_2 (1 - p_2)}{n_2}}$$

Here $n_1$ and $n_2$ are the sample sizes and $p_1$ and $p_2$ are the sample proportions for the first group (e.g., treatment) and second group (e.g., placebo) of patients respectively. Consider a study where we want to compare the risk of overweight between male and female patients. We have random samples of 40 male and 50 female patients from the population; the data are shown in Table 4.5. If a patient's BMI is greater than 25 kg/m² we record it as "1", otherwise we record it as "0". The data in Table 4.5 show that 31 out of 40 male patients and 32 out of 50 female patients have a BMI above 25 kg/m².

Table 4.5: BMI for 40 male and 50 female patients (BMI > 25, Yes = 1 and No = 0)

Male       | Female
1 1 1 1    | 0 1 1 0 1
1 1 1 0    | 1 1 1 0 0
1 1 1 0    | 1 0 1 1 1
0 0 1 1    | 1 0 1 0 1
1 1 1 1    | 1 1 1 1 1
1 1 0 1    | 1 1 0 0 0
1 1 1 0    | 1 1 1 0 0
0 1 1 1    | 0 1 1 1 1
1 0 1 0    | 0 1 1 0 0
1 1 1 1    | 1 1 0 0 1

In the above study we are interested in comparing the proportions of male and female patients with BMI greater than 25 kg/m². Calculation of the 95% confidence interval is as follows:
• Sample sizes: male (M): $n_1 = 40$ and female (F): $n_2 = 50$
• Sample proportions: M: $p_1 = 31/40 = 77.5\%$ and F: $p_2 = 32/50 = 64\%$
• Multiplier: Z = 1.96 (see Table 4.8)
• Standard error: $SE = \sqrt{\dfrac{0.775 \times (1 - 0.775)}{40} + \dfrac{0.64 \times (1 - 0.64)}{50}} = 0.094696$
• Difference between sample proportions (M − F): 0.135
• Lower limit = Diff − Z × SE = 0.135 − 1.96 × 0.094696 = −0.051
• Upper limit = Diff + Z × SE = 0.135 + 1.96 × 0.094696 = 0.321
• 95% confidence interval is (−0.051, 0.321)

We are 95% confident that the difference in the proportions of male and female patients who have BMI above 25 kg/m² in the population lies between −0.051 and 0.321. Since the interval includes the value of zero, the difference may not be statistically significant. This means the data do not suggest a difference.
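The two-proportion calculation can be sketched in Python in the same style. The function name `two_proportion_ci` is hypothetical, and the large-sample normal approximation is assumed, as in the text; the counts are those of the male and female samples above.

```python
def two_proportion_ci(r1, n1, r2, n2, z=1.96):
    """Large-sample CI for the difference p1 - p2 between two proportions."""
    p1, p2 = r1 / n1, r2 / n2
    se = (p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2) ** 0.5
    diff = p1 - p2
    return diff, se, diff - z * se, diff + z * se

# 31 of 40 male and 32 of 50 female patients with BMI > 25 kg/m^2
diff, se, lower, upper = two_proportion_ci(31, 40, 32, 50)
print(f"diff = {diff:.3f}, SE = {se:.6f}")
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

An interval that contains zero, as here, is read as no statistically significant difference between the two proportions.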

4.5 Comparing Two Groups: Paired Data

In Section 4.3 we discussed the difference between two population means assuming that the samples were independent. However, there are studies where the data consist of pairs of measurements. These pairs may be two outcomes measured on the same subjects/patients under two different treatments, or the same subjects may be measured before and after receiving some treatment. The pairs may also be two individuals matched during sample selection to share some key characteristics such as age and sex. Pairs of twins or siblings may be assigned randomly to two treatments in such a way that members of a single pair receive different treatments.

It sometimes happens that true differences do not exist between the two populations with respect to the variable of interest, but the presence of extraneous sources of variation may cause rejection of the hypothesis of no difference. On the other hand, true differences may be masked by the presence of extraneous factors. The objective in paired comparisons is to eliminate as many sources of extraneous variation as possible by making the pairs similar with respect to as many variables as possible.

This is done by considering the difference within each pair of observations. Thus we convert our data of pairs of values into a single sample of differences and use the difference, rather than the individual observations, as the variable of interest. We denote these differences by d, their sample mean by d̄ and their standard deviation by s_d. Here we assume that the differences between pairs of observations are random and follow the normal distribution. It can be shown that the sampling distribution of the mean of differences then also follows the normal distribution. Thus, in repeated sampling, 95% of the means of differences will be within about 2 SE of the difference in true means, i.e., 95% of the time the difference in true means will be within about two SE of the mean of differences. Hence, a 95% confidence interval for the difference in two true means for paired data can be calculated using the following formula.

Mean of Differences ± T × SE of Differences

Or, equivalently in notation:

d̄ ± T × SE = d̄ ± T × s_d / √n

Here T is the t-distribution "Multiplier" value with (n − 1) degrees of freedom, where n is the number of pairs (see Table 4.7 for the T value). Please note that the number of observations in each sample must be the same.

Consider a study that was conducted to determine weight loss in obese women after 12 weeks of treatment with a very-low-calorie diet (VLCD). The 9 women participating in the study were from an outpatient, hospital-based treatment program for obesity. The women's weights before and after the 12-week VLCD treatment are shown in the following table (columns 1 and 2) and the difference in weight (after – before) is in column 3.

Table 4.6: Weight (kg) loss of 9 women

Weight Before VLCD    Weight After VLCD    d = After – Before
117.3                 83.3                 -34.0
111.4                 85.9                 -25.5
 98.6                 75.8                 -22.8
104.3                 82.9                 -21.4
105.4                 82.3                 -23.1
100.4                 77.7                 -22.7
 81.7                 62.7                 -19.0
 89.5                 69.0                 -20.5
 78.2                 63.9                 -14.3

Steps for calculation of a 95% confidence interval are:

• Number of pairs: n = 9
• Sample mean of observed differences: d̄ = −22.59 kg
• Standard deviation of differences: s_d = 5.32 kg
• Standard Error: SE = s_d / √n = 5.32 / √9 = 1.77 kg
• Multiplier: T = 2.306 (see Table 4.7)
• Lower limit = −22.59 − 2.306 × 1.77 = −26.68 kg
• Upper limit = −22.59 + 2.306 × 1.77 = −18.50 kg
• 95% confidence interval is (−26.68, −18.50) kg
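The steps above can be retraced with a short Python sketch. It uses only the standard library and takes the differences directly from Table 4.6; the function name paired_ci is an illustrative choice and not part of this module, and the multiplier T = 2.306 is read from Table 4.7 rather than computed.

```python
import math
import statistics

# Differences d = After - Before (kg), copied from column 3 of Table 4.6.
differences = [-34.0, -25.5, -22.8, -21.4, -23.1, -22.7, -19.0, -20.5, -14.3]


def paired_ci(diffs, t_multiplier):
    """Return (mean, sd, se, lower, upper) for a sample of paired differences.

    paired_ci is a hypothetical helper name used only for this illustration.
    """
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)        # sample SD, divisor n - 1
    se = sd_d / math.sqrt(n)              # standard error of the mean difference
    margin = t_multiplier * se
    return mean_d, sd_d, se, mean_d - margin, mean_d + margin


# T = 2.306 is the 95% multiplier for n - 1 = 8 degrees of freedom (Table 4.7).
mean_d, sd_d, se, lower, upper = paired_ci(differences, t_multiplier=2.306)

print(f"mean of differences = {mean_d:.2f} kg")
print(f"SD of differences   = {sd_d:.2f} kg")
print(f"standard error      = {se:.2f} kg")
print(f"95% CI              = ({lower:.2f}, {upper:.2f}) kg")

# The difference is statistically significant when the interval excludes zero.
print("interval excludes zero:", not (lower <= 0 <= upper))
```

The sketch carries unrounded intermediate values, so its limits agree with the hand calculation to two decimal places even though multiplying the rounded figures 2.306 × 1.77 differs slightly in the last digit.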

We are 95% confident that the true mean of differences of weights before and 12 weeks after VLCD lies between -26.68 kg and -18.50 kg. Since the interval does not include zero, the difference is statistically significant. This means the sample data support the conclusion that the mean weight 12 weeks after VLCD differs from the mean weight before VLCD. Both limits of the 95% confidence interval are negative, which indicates that VLCD reduces weight significantly (since we calculated the confidence interval for the after weights minus the before weights, that is, the confidence interval for the difference "After – Before" in the population).
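To illustrate why pairing removes extraneous variation, the following Python sketch compares the standard error of the paired differences with the standard error that would result if the two columns of Table 4.6 were (incorrectly) treated as independent samples. The independent-samples standard error used here is the usual unequal-standard-deviation form, √(s1²/n1 + s2²/n2), which is assumed to match the one used for independent samples in Section 4.3; the variable names are illustrative.

```python
import math
import statistics

# Weights (kg) before and after VLCD, copied from Table 4.6.
before = [117.3, 111.4, 98.6, 104.3, 105.4, 100.4, 81.7, 89.5, 78.2]
after = [83.3, 85.9, 75.8, 82.9, 82.3, 77.7, 62.7, 69.0, 63.9]
n = len(before)

# Paired analysis: work with the within-pair differences (After - Before).
diffs = [a - b for a, b in zip(after, before)]
se_paired = statistics.stdev(diffs) / math.sqrt(n)

# Same data treated as two independent samples (assumed unequal-SD formula).
se_independent = math.sqrt(
    statistics.variance(before) / n + statistics.variance(after) / n
)

print(f"SE, paired differences  = {se_paired:.2f} kg")
print(f"SE, independent samples = {se_independent:.2f} kg")
```

Because heavier women remain heavier after treatment, much of the spread in the raw weights is between-subject variation; taking differences within each woman removes it, which is why the paired standard error is the smaller of the two.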


Summary of This Module:

Terms and notations:

• Sample mean
• True/population mean
• Sample standard deviation
• True standard deviation
• Difference between two sample means
• Difference between two true means
• Sample proportion
• True proportion
• Difference in two true proportions
• T-multiplier
• Z-multiplier
• Standard error (SE)
• Confidence intervals (CI)
  o Paired samples
  o Independent samples

95% confidence interval – continuous data:

• Confidence intervals for a single true/population mean
• Confidence intervals for the difference in two true means – assuming equal standard deviations
• Confidence intervals for the difference in two true means – assuming unequal standard deviations
• Confidence intervals for the difference in two true means – paired data

95% confidence interval – categorical data:

• Confidence intervals for a single true proportion
• Confidence intervals for the difference in two true proportions


Table 4.7: The t-distribution table for 2-tailed p-values (and confidence level)

        α = p-value (confidence level)
d.f.    0.20 (80%)   0.10 (90%)   0.05 (95%)   0.02 (98%)   0.01 (99%)   0.001 (99.9%)
1       3.078        6.314        12.706       31.821       63.657       636.619
2       1.886        2.920        4.303        6.965        9.925        31.599
3       1.638        2.353        3.182        4.541        5.841        12.924
4       1.533        2.132        2.776        3.747        4.604        8.610
5       1.476        2.015        2.571        3.365        4.032        6.869
6       1.440        1.943        2.447        3.143        3.707        5.959
7       1.415        1.895        2.365        2.998        3.499        5.408
8       1.397        1.860        2.306        2.896        3.355        5.041
9       1.383        1.833        2.262        2.821        3.250        4.781
10      1.372        1.812        2.228        2.764        3.169        4.587
11      1.363        1.796        2.201        2.718        3.106        4.437
12      1.356        1.782        2.179        2.681        3.055        4.318
13      1.350        1.771        2.160        2.650        3.012        4.221
14      1.345        1.761        2.145        2.624        2.977        4.140
15      1.341        1.753        2.131        2.602        2.947        4.073
16      1.337        1.746        2.120        2.583        2.921        4.015
17      1.333        1.740        2.110        2.567        2.898        3.965
18      1.330        1.734        2.101        2.552        2.878        3.922
19      1.328        1.729        2.093        2.539        2.861        3.883
20      1.325        1.725        2.086        2.528        2.845        3.850
21      1.323        1.721        2.080        2.518        2.831        3.819
22      1.321        1.717        2.074        2.508        2.819        3.792
23      1.319        1.714        2.069        2.500        2.807        3.768
24      1.318        1.711        2.064        2.492        2.797        3.745
25      1.316        1.708        2.060        2.485        2.787        3.725
26      1.315        1.706        2.056        2.479        2.779        3.707
27      1.314        1.703        2.052        2.473        2.771        3.690
28      1.313        1.701        2.048        2.467        2.763        3.674
29      1.311        1.699        2.045        2.462        2.756        3.659
30      1.310        1.697        2.042        2.457        2.750        3.646
40      1.303        1.684        2.021        2.423        2.704        3.551
50      1.299        1.676        2.009        2.403        2.678        3.496
60      1.296        1.671        2.000        2.390        2.660        3.460
70      1.294        1.667        1.994        2.381        2.648        3.435
80      1.292        1.664        1.990        2.374        2.639        3.416
90      1.291        1.662        1.987        2.368        2.632        3.402
100     1.290        1.660        1.984        2.364        2.626        3.390
110     1.289        1.659        1.982        2.361        2.621        3.381
120     1.289        1.658        1.980        2.358        2.617        3.373
∞       1.282        1.645        1.960        2.327        2.576        3.291
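Table 4.7 does not list every value of the degrees of freedom above 30. The Python sketch below shows one way to look up the 95% multiplier from the tabulated values. When the exact d.f. is not tabulated it falls back to the next lower tabulated d.f., which gives a slightly wider (more conservative) interval; this fallback rule is a common convention assumed here, not something stated in this module, and the names T_95 and t_multiplier_95 are illustrative.

```python
# 95% column (alpha = 0.05) of Table 4.7, keyed by degrees of freedom.
T_95 = {
    1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
    6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228,
    11: 2.201, 12: 2.179, 13: 2.160, 14: 2.145, 15: 2.131,
    16: 2.120, 17: 2.110, 18: 2.101, 19: 2.093, 20: 2.086,
    21: 2.080, 22: 2.074, 23: 2.069, 24: 2.064, 25: 2.060,
    26: 2.056, 27: 2.052, 28: 2.048, 29: 2.045, 30: 2.042,
    40: 2.021, 50: 2.009, 60: 2.000, 70: 1.994, 80: 1.990,
    90: 1.987, 100: 1.984, 110: 1.982, 120: 1.980,
}


def t_multiplier_95(df):
    """Look up the 95% T multiplier for the given degrees of freedom.

    Assumed convention: if df is not tabulated, use the largest tabulated
    d.f. that does not exceed it, so the multiplier is never too small.
    """
    if df < 1:
        raise ValueError("degrees of freedom must be at least 1")
    tabulated = max(k for k in T_95 if k <= df)
    return T_95[tabulated]


# Paired example of Section 4.5: n = 9 pairs, so d.f. = n - 1 = 8.
print(t_multiplier_95(9 - 1))
# A d.f. that is not tabulated falls back to the row for 30 d.f.
print(t_multiplier_95(35))
```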

Table 4.8: Normal distribution multiplier (Z).

Confidence Level    Z-Value
90%                 1.65
95%                 1.96
99%                 2.57
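As a cross-check on Table 4.8, the Z multipliers can be computed from the standard normal distribution. The sketch below uses NormalDist from Python's standard statistics module; the function name z_multiplier is an illustrative choice. For a two-sided interval with confidence level C, the multiplier is the normal quantile at 1 − (1 − C)/2.

```python
from statistics import NormalDist


def z_multiplier(confidence_level):
    """Two-sided standard normal multiplier for a confidence level in (0, 1)."""
    tail = (1 - confidence_level) / 2          # probability left in each tail
    return NormalDist().inv_cdf(1 - tail)      # quantile of the standard normal


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: Z = {z_multiplier(level):.3f}")
```

To three decimal places these agree with the last row (∞ d.f.) of Table 4.7: 1.645, 1.960 and 2.576. Table 4.8 lists the same multipliers to two decimal places, with the 99% value given as 2.57 where conventional rounding of 2.576 would give 2.58.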