Effect of Smoking on Depressive Symptomatology: A Reexamination of Data from the National Longitudinal Study of Adolescent Health

Author: Alison Douglas
Vol. 162, No. 5 Printed in U.S.A. DOI: 10.1093/aje/kwi219

American Journal of Epidemiology Copyright © 2005 by the Johns Hopkins Bloomberg School of Public Health All rights reserved


Brian Duncan and Daniel I. Rees

From the Department of Economics, University of Colorado, Denver, CO.

Received for publication October 20, 2004; accepted for publication December 22, 2004.

adolescent; depression; health; smoking

Abbreviations: CES-D, Center for Epidemiologic Studies Depression; CI, confidence interval; OR, odds ratio.

Many studies have documented a strong association between smoking and depression (1–8). Two explanations have traditionally been proposed for this association (9). The first is that depression leads to smoking, perhaps because depressed individuals turn to cigarettes as a means of self-medication; the second is that underlying environmental or genetic factors predispose individuals to both smoking and depression. Recently, however, a third explanation for the association between smoking and depression has been proposed, namely, that smoking itself leads to emotional disturbances (10–12). This explanation is supported by a number of influential studies on teenagers and, if true, would have enormous implications both because the treatment and consequences of depression are so costly, and because policymakers would have a potentially powerful tool to reduce the incidence of a widespread illness.

Previous researchers in this area have shown that teens who smoke are at increased risk of subsequently developing the symptoms of depression (10, 11). This association, however, could be driven by unobserved (from the standpoint of the researcher) factors having to do with the home environment or with an individual’s genetic makeup. In order to explore the role played by difficult-to-measure environmental and genetic influences potentially correlated with both smoking and depression, we compare estimates from standard regression models, which can be thought of as providing “naïve” estimates of the effect of smoking on depression, with fixed effects estimates that completely control for time-invariant factors.

MATERIALS AND METHODS

The primary data source for this project is the National Longitudinal Study of Adolescent Health, the same data source used by Goodman and Capitman (11) in their influential study on this topic. Detailed descriptions of the Adolescent Health design and data collection efforts are published elsewhere (13, 14).

Correspondence to Dr. Daniel I. Rees, Department of Economics, University of Colorado, Campus Box 181, Denver, CO 80217-3364 (e-mail: [email protected]).


Am J Epidemiol 2005;162:461–470

Downloaded from http://aje.oxfordjournals.org/ at aurarialibrary on July 9, 2013

Using 1995–1996 data from the first two waves of the National Longitudinal Study of Adolescent Health, the authors found that respondents who smoked cigarettes scored, on average, three points higher than did nonsmokers on the Center for Epidemiologic Studies Depression (CES-D) Scale. This gap persists even after accounting for observable factors, such as personal and parental characteristics. In contrast, controlling for the influence of unobservable factors potentially correlated with smoking behavior and depression produces smaller estimates. For instance, estimates from a linear regression model augmented with fixed effects suggest that the average male smoker would score 0.84 points higher on the CES-D Scale (95% confidence interval: 0.44, 1.25) than his nonsmoking counterpart; the average female smoker is predicted to score 1.25 points higher on the CES-D Scale (95% confidence interval: 0.75, 1.75) than her nonsmoking counterpart. The authors conclude that, for the average adolescent, the association between smoking and the symptoms of depression can in large part be attributed to the influence of unobservable factors.

462 Duncan and Rees

Measures

To assess cigarette consumption, the Adolescent Health survey included the questions, “[d]uring the past 30 days, on how many days did you smoke cigarettes?” and “[d]uring the past 30 days, on the days that you smoked, how many cigarettes did you smoke each day?” Packs of cigarettes smoked in the past month were calculated as the product of the answers to these two questions, divided by 20 (11). A little more than a quarter of the sample (25.3 percent of males and 25.8 percent of females) smoked at baseline. By follow-up, a little over a third of the sample smoked (33.4 percent of males and 33.9 percent of females). These data are consistent with those from other nationally representative surveys of adolescent smoking from the mid-1990s (15). Male smokers at baseline consumed, on average, 9.33 packs per month, while female smokers consumed, on average, 7.74 packs per month.

To assess depressive symptomatology, the Adolescent Health in-home survey included 18 of the 20 items that make up the Center for Epidemiologic Studies Depression (CES-D) Scale (16). For instance, respondents were asked how often during the past week they were “bothered by things that usually don’t bother you,” how often did they not “feel like eating,” and how often did they feel “like you couldn’t shake off the blues even with help from your family or your friends?” (The two missing items from the CES-D questionnaire were “my sleep was restless” and “I had crying spells.”) Each response was coded on a scale of 0–3 based on frequency (0 = rarely or none of the time; 1 = some or a little of the time; 2 = occasionally or a moderate amount of the time; and 3 = most or all of the time), and adding up these responses produced a score of between 0 and 54 (Cronbach’s α = 0.86). To facilitate the comparison of our results with those of previous researchers, we rescaled this score to correspond with the 20-item CES-D Scale (11). Thus, respondents could be assigned a CES-D score of between 0 and 60, with higher numbers indicating the presence of more depressive symptoms.

The primary focus of the present study is not on depression per se but on its symptoms as measured by the CES-D Scale. Accordingly, the CES-D score is initially treated as a continuous variable. However, previous work has shown that a dichotomized version of the CES-D Scale can be used as a screening instrument for depression in an adolescent population, provided that appropriate cutpoints are chosen (17). Using the cutpoints of Goodman and Capitman (11), we present additional analyses in which the CES-D score is dichotomized.

Statistical models

We begin our study by estimating the effect of smoking on depressive symptomatology using a standard linear regression framework in which the CES-D score of individual $i$ at follow-up ($t = 2$) is related to a set of observable factors and smoking behavior at baseline ($t = 1$) by the following equation:

\[
\text{CES-D}_{i,t=2} = \alpha + \beta X_{i,t=1} + \pi_1 s_{i,t=1} + \pi_2\,\mathit{ppm}_{i,t=1} + \varepsilon_{i,t=2}, \quad i = 1, \ldots, n, \tag{1}
\]

where $s_{i,t=1}$ is a dichotomous variable equal to one for youth who smoked at least one cigarette in the 30 days prior to the baseline interview; $\mathit{ppm}_{i,t=1}$ is a measure of smoking intensity equal to the number of cigarettes smoked in the 30 days prior to the baseline interview, divided by 20; $X_{i,t=1}$ is a vector of controls that includes measures of age, disability status, personal and household characteristics, parental education, and a set of county-level variables; and $\varepsilon_{i,t=2}$ is a random error term. The parameter $\pi_1$ represents the effect of baseline smoking participation on the CES-D score at follow-up (controlling for the variables included in the vector $X_{i,t=1}$), and $\pi_2$ represents the effect of smoking an additional pack per month.

Estimating equation 1 using ordinary least squares will produce unbiased estimates of $\pi_1$ and $\pi_2$ if the error term, $\varepsilon_{i,t=2}$, is uncorrelated with smoking behavior. However, if unobservable environmental or genetic factors are correlated with both the CES-D score at follow-up and smoking behavior, then ordinary least squares estimates will be biased. Furthermore, because the baseline CES-D score is not included as an explanatory variable in equation 1, this approach is subject to a problem of reverse causality: if preexisting depression leads to smoking at baseline, then it is inappropriate to interpret ordinary least squares estimates of $\pi_1$ and $\pi_2$ as the effect of smoking on depressive symptomatology. We address these problems by taking fuller advantage of the longitudinal nature of the Adolescent Health data.
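The bias described above can be illustrated with a minimal sketch on synthetic data. It is not the authors' analysis: the unobserved factor `u`, every parameter value, and the helper names `ols_slope` and `simulate` are assumptions made for illustration, and the model keeps only the smoking indicator from equation 1. Because `u` raises both the chance of smoking and the CES-D score, the ordinary least squares slope should overstate the true effect.

```python
import random


def ols_slope(x, y):
    """Slope from a one-regressor OLS fit of y on x (intercept included)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    return cov_xy / var_x


def simulate(n=5000, true_pi1=1.0, gamma=2.0, seed=0):
    """Synthetic cross-section; all values are made up for illustration.

    u is an unobserved factor that raises both the probability of
    smoking and the CES-D score.
    """
    rng = random.Random(seed)
    smoker, cesd = [], []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)                           # unobserved factor
        s = 1 if u + rng.gauss(0.0, 1.0) > 0.5 else 0     # smoking depends on u
        y = 10.0 + true_pi1 * s + gamma * u + rng.gauss(0.0, 3.0)
        smoker.append(s)
        cesd.append(y)
    return smoker, cesd


TRUE_PI1 = 1.0
smoker, cesd = simulate(true_pi1=TRUE_PI1)
naive_pi1 = ols_slope(smoker, cesd)
print(f"true effect: {TRUE_PI1:.2f}, naive OLS estimate: {naive_pi1:.2f}")
```

With these made-up parameters the naive estimate lands well above the true effect, which is the pattern that a comparison with fixed effects estimates is meant to detect.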


The Adolescent Health Study began with a stratified, random sample of all high schools with more than 30 students in the United States. Eighty high schools were chosen from this population, and an additional 52 middle or “feeder” schools from the same communities were included in the study. Any student who appeared on the roster of one of these 132 schools was eligible to participate in the Adolescent Health wave I (baseline) in-home survey. The wave I in-home interviews were conducted primarily between May and September of 1995 and produced a nationally representative sample of students aged 11–21 years in grades 7 through 12. Wave II (follow-up) in-home interviews were conducted between April and August of the following year. The mean period between baseline and follow-up interviews was 10.9 months.

We analyze data from the wave I and wave II in-home surveys. These surveys contain good measures of tobacco use and depressive symptomatology, as well as personal characteristics, family background variables, and a large array of contextual variables that pertain to a respondent’s county and state of residence. To address data confidentiality and security issues and to minimize the potential for interviewer or parental influence, respondents entered their answers on a laptop computer. For particularly sensitive questions, the respondent listened to prerecorded questions through earphones and then directly entered the answers. Of the 18,924 respondents in the wave I in-home weighted sample, 13,569 were reinterviewed at follow-up. An additional 501 observations were lost because of missing data (table 1).
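The sample counts in this paragraph can be checked with simple arithmetic. The short sketch below uses only figures from the text and the group sizes reported in table 1; the variable names are illustrative.

```python
# Counts taken from the text.
high_schools, feeder_schools = 80, 52
wave1_weighted_sample = 18_924
reinterviewed = 13_569
lost_to_missing_data = 501

# Group sizes reported in table 1.
males, females = 6_320, 6_748

total_schools = high_schools + feeder_schools
analytic_sample = reinterviewed - lost_to_missing_data
retention_rate = reinterviewed / wave1_weighted_sample

print(total_schools, analytic_sample, males + females)
print(f"share reinterviewed at follow-up: {retention_rate:.1%}")
```

The analytic sample implied by the text matches the combined male and female sample sizes in table 1.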


TABLE 1. Variable means and percentages, by gender, National Longitudinal Study of Adolescent Health, 1995–1996*

Variable                                       Males (n = 6,320)    Females (n = 6,748)
CES-D† score at baseline (mean (SD†))          10.96 (7.24)         13.00 (8.68)
CES-D score at follow-up (mean (SD))           10.75 (7.45)         12.96 (8.7)
Smoker at baseline (%)                         25                   26
Smoker at follow-up (%)                        33                   34
Packs per month at baseline‡ (mean (SD))       9.33 (12.71)         7.74 (11.08)
Packs per month at follow-up‡ (mean (SD))      10.02 (14.11)        8.67 (11.92)
Age in years at baseline (mean (SD))           15.62 (1.64)         15.46 (1.58)
Age in years at follow-up (mean (SD))          16.53 (1.64)         16.37 (1.59)
Disability (%)                                 2                    3
Race/ethnicity variables (%)
  White                                        71                   71
  Black                                        15                   16
  Other race                                   14                   13
  Hispanic                                     12                   12
Household variables (%)
  Two-parent home                              64                   64
  Welfare receipt                              8                    9
  Alcoholic parent                             14                   15
Parental education (%)
  No high school                               10                   10
  High school                                  23                   25
  Some college                                 29                   27
  College                                      16                   15
  Professional degree                          12                   13
  Education missing                            10                   11
County-level variables (mean (SD))
  Unemployment                                 0.07 (0.02)          0.07 (0.02)
  Rural                                        0.28 (0.28)          0.29 (0.28)
  Urban                                        0.60 (0.39)          0.60 (0.39)

* Sample weights were used in the calculations. There are 18,924 adolescents in the weighted baseline in-home survey. Of these, 13,569 were reinterviewed at follow-up. An additional 501 observations were lost because of missing information on smoking habits, CES-D score, age, or county data. The smoker variables are dichotomous and equal to one for youth who smoked at least one cigarette in the 30 days prior to being interviewed and zero otherwise. The packs-per-month variables are equal to the number of cigarettes smoked in the past 30 days divided by 20. The variable, disability, is dichotomous and equal to one for youth who indicated in the baseline interview that they have difficulty using their hands, arms, legs, or feet because of a permanent physical condition and zero otherwise. The variables White, Black, and other race are dichotomous, mutually exclusive, and based on a series of questions from the baseline in-home interview. Hispanic is a dichotomous variable equal to one for youth who self-identified as being of Hispanic origin and zero otherwise. The household variables are also dichotomous and come from the questionnaire completed by the respondent’s parent/caregiver at baseline. The two-parent home variable indicates that the parent/caregiver who completed the Adolescent Health parent survey lived with a spouse or partner at the baseline interview. The welfare variable is equal to one if the parent/caregiver who completed the Adolescent Health parent survey receives public assistance and zero otherwise. The alcoholic parent variable is equal to one if either biologic parent currently suffers from alcoholism according to the baseline parent survey and zero otherwise. The parental education variables are dichotomous, mutually exclusive, and based on the responses pertaining to the best-educated parent/caregiver living in the household. The county controls are taken from the Adolescent Health contextual files. The unemployment variable is the county unemployment rate. The rural and urban variables are the fraction of the county’s population that resides in rural or urbanized areas, respectively.
† CES-D, Center for Epidemiologic Studies Depression; SD, standard deviation.
‡ Calculated for smokers only. There are 1,554 male and 1,593 female smokers at baseline and 1,979 male and 2,025 female smokers at follow-up.
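The constructed variables summarized in table 1 can be sketched in a few lines. The packs-per-month formula and the cutpoints (22 for males, 24 for females) come from the text; the proportional 20/18 rescaling of the 18-item CES-D total is an assumption, since the paper says only that the score was rescaled to correspond with the 20-item scale. The function names are hypothetical.

```python
def packs_per_month(days_smoked, cigarettes_per_day):
    """Packs in the past 30 days: days smoked x cigarettes per day, / 20."""
    return days_smoked * cigarettes_per_day / 20


def rescale_cesd(score_18_items):
    """Map the 18-item total (0-54) onto the 20-item range (0-60).

    Assumption: proportional rescaling by 20/18; the exact rule used in
    the paper is not stated.
    """
    return score_18_items * 20 / 18


# Cutpoints from the text: scoring above 22 (males) or 24 (females).
CUTPOINTS = {"male": 22, "female": 24}


def high_symptoms(cesd_score, sex):
    """Return 1 if the CES-D score is above the sex-specific cutpoint."""
    return int(cesd_score > CUTPOINTS[sex])


print(packs_per_month(30, 10))   # someone smoking 10 cigarettes every day
print(rescale_cesd(54))          # maximum 18-item total
print(high_symptoms(23, "male"), high_symptoms(23, "female"))
```

Under the proportional assumption, the maximum 18-item total of 54 maps to the 20-item maximum of 60, which agrees with the score range given in the text.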


Specifically, we modify equation 1 to include individual-specific intercepts, often called “fixed effects” (18). These fixed effects, denoted $\mu_i$, are incorporated into equation 1 as:

\[
\text{CES-D}_{i,t} = \alpha + \beta X_{i,t} + \pi_1 s_{i,t} + \pi_2\,\mathit{ppm}_{i,t} + \mu_i + \varepsilon_{i,t}, \quad i = 1, \ldots, n;\ t = 1, 2. \tag{2}
\]

Each respondent $i$ contributes two observations to the estimation of this model, one from the baseline survey ($t = 1$) and the other from the follow-up survey ($t = 2$). Only the within-person variation is used to estimate the parameters of equation 2. All time-invariant factors (such as preexisting depression, basic personality traits, and family background) are controlled for by the fixed effects (18). However, fixed effects do not control for omitted time-variant factors, such as the respondent’s emotional state. If the correlation between smoking and CES-D scores is driven in part by difficult-to-measure factors that vary between the baseline and follow-up interviews, then this estimation strategy is likely to produce an upper-bound estimate of the effect of smoking on depressive symptomatology.

The fixed effects approach can also be used when the dependent variable is dichotomous. Following previous research, we create a high depressive symptomatology variable that is equal to one for youth scoring above a cutpoint on the CES-D Scale, and zero otherwise (11). The cutpoint for females is 24; for males, the cutpoint is 22. The logistic model is derived by assuming that the probability that respondent $i$ scores above the cutpoint at the follow-up interview takes the following form:

\[
P_{i,t=2} = \frac{\exp(\beta X_{i,t=1} + \pi s_{i,t=1})}{1 + \exp(\beta X_{i,t=1} + \pi s_{i,t=1})}, \quad i = 1, \ldots, n, \tag{3}
\]

where $X_{i,t=1}$ and $s_{i,t=1}$ are defined as they are in equation 1. Solving equation 3 leads to the log odds ratio:

\[
\log\left(\frac{P_{i,t=2}}{1 - P_{i,t=2}}\right) = \beta X_{i,t=1} + \pi s_{i,t=1}, \quad i = 1, \ldots, n. \tag{4}
\]

The parameter $\pi$ represents the effect of baseline smoking participation on the follow-up log odds ratio (controlling for the variables included in the vector $X_{i,t=1}$). The parameters in equation 4 are estimated by maximizing the log likelihood function:

\[
\log L = \sum_{i=1}^{n} \left[ d_{i,t=2} \log\left(\frac{\exp(z_{i,t=1})}{1 + \exp(z_{i,t=1})}\right) + (1 - d_{i,t=2}) \log\left(1 - \frac{\exp(z_{i,t=1})}{1 + \exp(z_{i,t=1})}\right) \right], \tag{5}
\]

where $d_{i,t}$ is a dichotomous variable equal to one if $i$ scores above the depressive cutpoint at interview $t$, and $z_{i,t} = \beta X_{i,t} + \pi s_{i,t}$. However, if smoking behavior is correlated with unobservable factors, then the parameter estimates from equation 5 will be biased in the same way that the ordinary least squares estimates are biased. One solution is to once again take advantage of the longitudinal nature of the Adolescent Health data by modifying equation 4 to include individual-specific intercepts. Including individual fixed effects in the logistic model leads to a log odds ratio of the following form:

\[
\log\left(\frac{P_{i,t}}{1 - P_{i,t}}\right) = \mu_i + \beta X_{i,t} + \pi s_{i,t}, \quad i = 1, \ldots, n;\ t = 1, 2. \tag{6}
\]

In general, fixed effects binary choice models suffer from the incidental parameters problem (19). However, Chamberlain (19) shows that the parameters $\beta$ and $\pi$ in equation 6 can be estimated, without the need to estimate $\mu_i$, by maximizing the conditional log likelihood function:

\[
\log L = \sum_{i=1}^{n} \left[ A_i \log\left(\frac{\exp(z_{i,t=2})}{\exp(z_{i,t=1}) + \exp(z_{i,t=2})}\right) + B_i \log\left(\frac{\exp(z_{i,t=1})}{\exp(z_{i,t=1}) + \exp(z_{i,t=2})}\right) \right], \tag{7}
\]

where $z_{i,t}$ is defined as it is in equation 5; $A_i \in \{0, 1\}$ is an indicator variable equal to one if $i$ is nondepressed at baseline and depressed at follow-up; and $B_i \in \{0, 1\}$ is an indicator variable equal to one if $i$ is depressed at baseline and nondepressed at follow-up. The fixed effects logistic model is analogous to the traditional fixed effects model in that it controls for both observed and unobserved time-invariant factors but does not control for unobserved time-variant factors.

TABLE 2. Mean follow-up Center for Epidemiologic Studies Depression scores for baseline smokers and nonsmokers, by gender, National Longitudinal Study of Adolescent Health, 1995–1996*

                                Males                                 Females
                                Score   95% confidence interval       Score   95% confidence interval
Smokers at baseline (A)†        13.07   12.65, 13.50                  15.41   14.95, 15.87
Nonsmokers at baseline (B)‡      9.96    9.77, 10.16                  12.10   11.87, 12.33
Difference between A and B       3.11    2.69, 3.52                    3.31    2.84, 3.78

* Sample weights were used in the calculations. Smokers consumed at least one cigarette in the 30 days prior to the baseline interview.
† Sample size: for males, n = 1,554; for females, n = 1,593.
‡ Sample size: for males, n = 4,766; for females, n = 5,155.

RESULTS

Basic descriptive statistics for the variables used in the analysis appear in table 1. Table 2 presents mean CES-D scores by smoking status at baseline. The mean follow-up score for males who smoked at baseline is 13.07 (95 percent confidence interval (CI): 12.65, 13.50), compared with a mean score of 9.96 (95 percent CI: 9.77, 10.16) for males who did not smoke at baseline.


The mean follow-up CES-D score for females who smoked at baseline is 15.41 (95 percent CI: 14.95, 15.87), compared with a mean score of 12.10 (95 percent CI: 11.87, 12.33) for females who did not smoke at baseline.

Ordinary least squares estimates

Although statistically significant at conventional levels, these differences may simply be a reflection of factors that are easily observed and measured by the researcher. If so, they can be accounted for using a standard regression framework in which the CES-D score at follow-up is a function of baseline smoking behavior and a set of controls also observable at baseline. Table 3 presents ordinary least squares estimates of the effect of smoking on depressive symptoms by gender. As noted above, ordinary least squares estimates are subject to a variety of limitations and are presented for comparison purposes only. Smoking participation by males is associated with a 2.85 increase in the CES-D score, and each additional pack per month is associated with an increase of 0.01 in the CES-D score, although this latter effect is not statistically significant at conventional levels. These estimates suggest that a male

TABLE 3. Ordinary least squares regression results showing the effect of baseline smoking on the Center for Epidemiologic Studies Depression score at follow-up, by gender, National Longitudinal Study of Adolescent Health, 1995–1996*

                                Males                                   Females
                                Marginal effect    95% confidence       Marginal effect    95% confidence
                                (increased score)  interval†            (increased score)  interval†
Smoker at baseline               2.85               2.01, 3.68           2.92               2.14, 3.71
Packs per month at baseline      0.01              -0.046, 0.057         0.06               0.0004, 0.116
Baseline controls
  Disability                     3.03               1.26, 4.81           3.97               1.55, 6.38
  Age (years)
    14–15                        1.55               0.85, 2.24           1.70               1.07, 2.32
    16–17                        1.67               0.91, 2.43           1.52               0.90, 2.13
    ≥18                          3.17               2.03, 4.32           1.87               0.52, 3.23
  Race/ethnicity variables
    Black                        1.88               1.14, 2.63           2.00               1.12, 2.87
    Other                        1.52               0.55, 2.49           1.87               0.82, 2.92
    Hispanic                     0.93               0.06, 1.79           1.59               0.43, 2.75
  Household variables
    Two-parent home              0.11              -0.57, 0.79          -0.29              -0.92, 0.33
    Welfare receipt              1.20               0.33, 2.08           0.88              -0.33, 2.09
    Alcoholic parent             0.11              -0.64, 0.85           1.16               0.38, 1.93
  Parental education
    No high school               1.40               0.39, 2.41           2.05               1.02, 3.09
    Some college                -0.60              -1.29, 0.1           -0.20              -0.93, 0.52
    College                     -0.69              -1.47, 0.09          -1.00              -1.87, -0.13
    Professional degree         -1.84              -2.57, -1.11         -1.98              -2.76, -1.20
    Education missing            0.69              -0.28, 1.66           0.74              -0.39, 1.87
  County-level variables
    Unemployment                 3.44              -9.86, 16.75          4.99              -8.33, 18.30
    Rural                        1.16              -1.29, 3.61          -1.28              -3.81, 1.25
    Urban                        1.36              -0.42, 3.14          -1.49              -3.30, 0.33
Constant                         6.59               4.66, 8.52          11.02               8.56, 13.48
R²                               0.102                                   0.087
Sample size (no.)                6,320                                   6,748

* Estimated coefficients are from ordinary least squares regressions. Sample weights were used in the calculations. The omitted age category is 11–13 years, the omitted race category is White, and the omitted parental education category is high school. Refer to the asterisk footnote of table 1 for definitions of the control variables.
† Adjusted for clustering by school.
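One way to read the two smoking coefficients in table 3 together is to combine them for a smoker with average consumption. The sketch below assumes that the predicted gap relative to a nonsmoker is $\pi_1 + \pi_2 \times$ (mean packs per month), using the ordinary least squares coefficients reported for equation 1 and the baseline means for smokers from table 1. The function name and this particular way of combining the coefficients are illustrative assumptions, not a calculation taken from the paper.

```python
# pi1: smoker coefficient; pi2: packs-per-month coefficient (OLS, equation 1).
# mean_ppm: mean packs per month among baseline smokers (table 1).
ols = {
    "male": {"pi1": 2.85, "pi2": 0.01, "mean_ppm": 9.33},
    "female": {"pi1": 2.92, "pi2": 0.06, "mean_ppm": 7.74},
}


def average_smoker_gap(pi1, pi2, mean_ppm):
    """Predicted CES-D gap between a smoker with average consumption
    and a nonsmoker, under the assumption gap = pi1 + pi2 * mean_ppm."""
    return pi1 + pi2 * mean_ppm


gaps = {sex: round(average_smoker_gap(**coefs), 2) for sex, coefs in ols.items()}
for sex, gap in gaps.items():
    print(sex, gap)
```

Under this reading, the regression-adjusted gaps stay close to the raw differences of about three points, in line with the statement that the gap persists after accounting for observable factors.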


TABLE 4. Fixed effects regression results showing the effect of smoking on the Center for Epidemiologic Studies Depression score, by gender, National Longitudinal Study of Adolescent Health, 1995–1996* Males Marginal effect (increased score)

Smoker

0.66

Packs per month F statistic

0.69, 1.63 0.02, 0.05

0.01

0.32, 0.04

0.32

6.00

10.18
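The logic of the fixed effects specification in equation 2 can be sketched on synthetic two-wave data. With only two observations per person, subtracting the baseline equation from the follow-up equation removes the time-invariant term $\mu_i$, and the within-person estimator reduces to a regression of the change in the CES-D score on the change in smoking status with no intercept. The sketch below is not the authors' code: the data are simulated, the model keeps only the smoking indicator, and all names and parameter values are assumptions made for illustration.

```python
import random


def ols_slope(x, y):
    """Slope from a one-regressor OLS fit of y on x (intercept included)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return cov_xy / sum((a - mean_x) ** 2 for a in x)


def within_slope(dx, dy):
    """Two-wave fixed effects estimate: regress change on change, no intercept."""
    return sum(a * b for a, b in zip(dx, dy)) / sum(a * a for a in dx)


def simulate_panel(n=5000, true_pi=1.0, gamma=2.0, seed=1):
    """Synthetic panel of (s1, y1, s2, y2) tuples; all values are made up."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)      # time-invariant unobserved factor
        mu = gamma * u               # its contribution to the CES-D score
        row = []
        for _wave in (1, 2):
            s = 1 if u + rng.gauss(0.0, 1.0) > 0.5 else 0
            y = 10.0 + true_pi * s + mu + rng.gauss(0.0, 2.0)
            row.extend((s, y))
        panel.append(tuple(row))
    return panel


TRUE_PI = 1.0
panel = simulate_panel(true_pi=TRUE_PI)

# Naive estimate: pool both waves and ignore the unobserved factor.
pooled_s = [p[0] for p in panel] + [p[2] for p in panel]
pooled_y = [p[1] for p in panel] + [p[3] for p in panel]
naive = ols_slope(pooled_s, pooled_y)

# Fixed effects estimate: use only within-person changes between waves.
d_s = [p[2] - p[0] for p in panel]
d_y = [p[3] - p[1] for p in panel]
fixed_effects = within_slope(d_s, d_y)

print(f"true: {TRUE_PI:.2f}  naive: {naive:.2f}  fixed effects: {fixed_effects:.2f}")
```

In this simulation only respondents whose smoking status changes between waves contribute to the fixed effects estimate, and the estimate stays near the true value because the confounding factor is time-invariant. A confounder that varied between waves would not be removed by differencing.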
