Unit 5 Logistic Regression Practice Problems. SOLUTIONS Version SAS

PubHlth640 - Spring 2012 Intermediate Biostatistics
Author: Shona Farmer

Source: Afifi A., Clark VA and May S. Computer Aided Multivariate Analysis, Fourth Edition. Boca Raton: Chapman and Hall, 2004.

Exercises #1-#3 utilize a data set provided by Afifi, Clark and May (2004). The data come from a longitudinal study of depression. The purpose of the study was to obtain estimates of the prevalence and incidence of depression and to explore its risk factors. The study variables were of several types: demographics, life events, stressors, physical health, health services utilization, medication use, lifestyle, and social support. These exercises use just a subset of these data. I have provided them to you in three formats: Stata (depress.dta), SAS (depress.sas7bdat), and Excel (depress.xls).

http://people.umass.edu/~biep640w/webpages/assignments.html

Consider the following three variables.

Variable    Codings                                 Format in SAS    Label in STATA
DRINK       1 = yes, 2 = no                         DRINK            DRINK
SEX         1 = male, 2 = female                    SEX              SEX
CASES       0 = Normal, 1 = Case of Depression      CASES            CASES

1. Source: Afifi A., Clark VA and May S. Computer Aided Multivariate Analysis, Fourth Edition. Boca Raton: Chapman and Hall, 2004, Problem 12.9, page 330.

Using Stata, SAS, or Excel, load the depression data set and fill in the following table:

              Regular Drinker
Sex           Yes      No       Total
Female        139      44       183
Male           95      16       111
Total         234      60       294

What are the odds that a woman is a regular drinker? 139 / 44 = 3.2

What are the odds that a man is a regular drinker? 95 / 16 = 5.9

What is the odds ratio? That is, compared to a man, what is the relative odds (odds ratio) that a woman is a regular drinker?

OR = [odds for woman] / [odds for man] = (139/44) / (95/16) = 0.53 (dividing the rounded odds, 3.2/5.9, gives 0.54)
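The arithmetic for Problem 1 can be checked with a few lines of code. The following is a minimal Python sketch, not part of the SAS solution; the variable names are illustrative only, and the counts are the ones in the table above. Carried through without intermediate rounding, the odds ratio comes to about 0.53, while dividing the rounded odds (3.2/5.9) gives 0.54.

```python
# Cell counts from the Problem 1 table (sex by regular drinker).
female_yes, female_no = 139, 44
male_yes, male_no = 95, 16

odds_female = female_yes / female_no   # odds that a woman is a regular drinker
odds_male = male_yes / male_no         # odds that a man is a regular drinker
odds_ratio = odds_female / odds_male   # women compared to men

# Equivalent cross-product form of the same odds ratio: (a*d) / (b*c)
cross_product = (female_yes * male_no) / (female_no * male_yes)

print(f"odds (female) = {odds_female:.2f}")
print(f"odds (male)   = {odds_male:.2f}")
print(f"odds ratio    = {odds_ratio:.2f}")
```

The cross-product form is the one used for the stratified tables in Problem 2.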

Sol_logistic_sas.doc


2. Repeat the tabulation that you produced for problem #1 two times, one for persons who are depressed and the other for persons who are not depressed.

Among Persons Who are Depressed

              Regular Drinker
Sex           Yes      No       Total
Female         33       7        40
Male            8       2        10
Total          41       9        50

OR (Relative odds, compared to a man, that a woman is a regular drinker): OR = [(33)(2)] / [(7)(8)] = 1.18

Among Persons Who are NOT Depressed

              Regular Drinker
Sex           Yes      No       Total
Female        106      37       143
Male           87      14       101
Total         193      51       244

OR (Relative odds, compared to a man, that a woman is a regular drinker): OR = [(106)(14)] / [(37)(87)] = 0.46

3. Fit a logistic regression model using these variables. Use DRINK as the dependent variable and CASES and SEX as independent variables. Also include as an independent variable the appropriate interaction term.

Fitted Model:

logit [ pr (drinker=yes) ] = 1.8269 - 0.4406 [ CASES ] - 0.7743 [ FEMALE ] + 0.9386 [ FEM_CASE ]

where
CASES = 1 if depressed; 0 otherwise
FEMALE = 1 if female; 0 otherwise
FEM_CASE = (CASES) * (FEMALE)
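Because the model contains CASES, FEMALE, and their product, it has one parameter for each of the four CASES-by-FEMALE cells (a saturated model), so the fitted log odds in each cell equal the observed log odds. The fitted coefficients can therefore be recovered directly from the Problem 2 counts. The Python sketch below is only a check of that reasoning, not part of the SAS solution; the names are illustrative, and the standard error formula (square root of the sum of reciprocal cell counts) is the usual large-sample result for a log odds ratio contrast.

```python
import math

# (drinkers, non-drinkers) in each (CASES, FEMALE) cell,
# taken from the two Problem 2 tables.
cells = {
    (0, 0): (87, 14),    # not depressed, male
    (0, 1): (106, 37),   # not depressed, female
    (1, 0): (8, 2),      # depressed, male
    (1, 1): (33, 7),     # depressed, female
}

def log_odds(cases, female):
    """Observed log odds of being a regular drinker in one cell."""
    yes, no = cells[(cases, female)]
    return math.log(yes / no)

b0 = log_odds(0, 0)                       # intercept: non-depressed men
b_cases = log_odds(1, 0) - b0             # depressed vs. not, among men
b_female = log_odds(0, 1) - b0            # women vs. men, among non-depressed
b_fem_case = (log_odds(1, 1) - log_odds(1, 0)) - (log_odds(0, 1) - b0)

# Large-sample SE of the interaction: sqrt of the sum of 1/count over all 8 counts.
se_fem_case = math.sqrt(sum(1 / n for pair in cells.values() for n in pair))

print(f"{b0:.4f} {b_cases:.4f} {b_female:.4f} {b_fem_case:.4f} SE={se_fem_case:.4f}")
```

Up to rounding, these values agree with the coefficients in the fitted model above.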

Is the interaction term in your model significant? No. $\hat{\beta}_3 = 0.9386$, $SE(\hat{\beta}_3) = 0.96$ and p-value = .33

How does your answer to problem #3 compare to your answer to problem #2? Comment. The answers match.

Among Depressed: OR = 1.18
Among NON-depressed: OR = 0.46
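The significance check on the interaction term is a Wald test: the estimate divided by its standard error, squared, is referred to a chi-square distribution with 1 degree of freedom. A minimal Python sketch of that calculation follows, using the estimate and standard error reported in the SAS output; it is a hand check under those reported values, not a substitute for the PROC LOGISTIC output.

```python
import math

estimate = 0.9386     # fem_case coefficient from the fitted model
std_error = 0.9579    # its standard error as reported in the SAS output

z = estimate / std_error
wald_chi_square = z ** 2

# With 1 degree of freedom, P(chi-square > z^2) = P(|Z| > |z|) = erfc(|z| / sqrt(2)).
p_value = math.erfc(abs(z) / math.sqrt(2))

print(f"Wald chi-square = {wald_chi_square:.4f}, p-value = {p_value:.2f}")
```

A p-value of about .33 is well above conventional significance levels, which is why the interaction is judged not significant.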


logit [ pr (drinker=yes) ] = 1.8269 - 0.4406 [ CASES] - 0.7743 [ FEMALE ] + 0.9386 [ FEM_CASE ]

Among Depressed

              “1” = Female     “0” = Male
CASES         1                1
FEMALE        1                0
FEM_CASE      1                0

logit [ female ] = 1.8269 - 0.4406 - 0.7743 + 0.9386 = 1.5506
logit [ male ] = 1.8269 - 0.4406 = 1.3863
logit [ female ] - logit [ male ] = 1.5506 - 1.3863 = +0.1643
OR [ women compared to men ] = exp { logit [ p1 ] - logit [ p0 ] } = exp { +0.1643 } = 1.1786

Among NON Depressed

              “1” = Female     “0” = Male
CASES         0                0
FEMALE        1                0
FEM_CASE      0                0

logit [ female ] = 1.8269 - 0.7743 = 1.0526
logit [ male ] = 1.8269
logit [ female ] - logit [ male ] = 1.0526 - 1.8269 = -0.7743
OR [ women compared to men ] = exp { logit [ p1 ] - logit [ p0 ] } = exp { -0.7743 } = 0.4610
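The two hand calculations above follow one pattern: evaluate the fitted logit for women and for men at a fixed value of CASES, subtract, and exponentiate. A short Python sketch of that pattern is given here as a check; the function names are illustrative and the coefficients are those of the fitted model.

```python
import math

# Coefficients of the fitted model: intercept, CASES, FEMALE, FEM_CASE
b0, b_cases, b_female, b_fem_case = 1.8269, -0.4406, -0.7743, 0.9386

def logit(cases, female):
    """Fitted log odds of being a regular drinker."""
    return b0 + b_cases * cases + b_female * female + b_fem_case * cases * female

def odds_ratio_female_vs_male(cases):
    """OR for women compared to men at a given depression status."""
    return math.exp(logit(cases, 1) - logit(cases, 0))

or_depressed = odds_ratio_female_vs_male(1)       # exp(b_female + b_fem_case)
or_not_depressed = odds_ratio_female_vs_male(0)   # exp(b_female)

print(f"Depressed:     OR = {or_depressed:.4f}")
print(f"Not depressed: OR = {or_not_depressed:.4f}")
```

The intercept and the CASES coefficient cancel in each difference, so the odds ratio depends only on the FEMALE and FEM_CASE coefficients.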


For SAS Users

*_______________________________________________
*
* Tell SAS location of data
*________________________________________________;
* You will have to edit this to be your path;
libname class "Z:\bigelow\teaching\web640\data sets";

*_________________________________________________
*
* Read data of interest into a temporary copy
*__________________________________________________;
data temp(keep=drink sex cases);
   set class.depress;
run;

*______________________________________________________________
*
* Create indicators as needed and format values for readability
*_____________________________________________________________;
proc format;
   value drinkf 0='0=nondrinker' 1='1=drinker';
   value casef  0='0=normal'     1='1=depressed';
   value sexf   0='0=male'       1='1=female';
run;

data temp(drop=drink sex);
   set temp;
   drink01=.;
   if drink=1 then drink01=1;
   else if drink=2 then drink01=0;
   format drink01 drinkf.;
   female=.;
   if sex=2 then female=1;
   else if sex=1 then female=0;
   format female sexf.;
   format cases casef.;
   fem_case = female*cases;
run;

*___________________________________________________________
*
* Descriptives
*_________________________________________________________;
proc freq data=temp;
   tables drink01 female cases fem_case;
run;

*______________________________________________
*
* Logistic regression model
* NOTE - SAS chooses as the event the lower value
* Use option DESCENDING so the value=1 is the event
* of interest
*_____________________________________________;
proc logistic data=temp descending;
   model drink01 = cases female fem_case;
run;
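For readers who want to see the recoding logic outside SAS, the second data step above can be sketched in Python. This is an illustration only: the four records are hypothetical (they are not rows of the depression data set), and the helper name recode is made up for this sketch. As in the SAS step, a value that is missing or not one of the expected codes is left missing rather than forced to 0.

```python
# Hypothetical raw records using the original codings:
# drink: 1 = yes, 2 = no; sex: 1 = male, 2 = female; cases: 0 = normal, 1 = depressed.
raw = [
    {"drink": 1, "sex": 2, "cases": 1},
    {"drink": 2, "sex": 1, "cases": 0},
    {"drink": 1, "sex": 1, "cases": 1},
    {"drink": None, "sex": 2, "cases": 0},   # missing DRINK stays missing
]

def recode(rec):
    """Build 0/1 indicators and the interaction term for one record."""
    drink01 = {1: 1, 2: 0}.get(rec["drink"])   # None if missing or unexpected
    female = {2: 1, 1: 0}.get(rec["sex"])      # None if missing or unexpected
    cases = rec["cases"]
    # The product is missing whenever either factor is missing.
    fem_case = None if female is None or cases is None else female * cases
    return {"drink01": drink01, "female": female, "cases": cases, "fem_case": fem_case}

recoded = [recode(r) for r in raw]
for row in recoded:
    print(row)
```

Coding the outcome as 0/1 with 1 = drinker is what makes the DESCENDING option in PROC LOGISTIC model the probability of being a drinker.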


Partial listing of Output

Response Profile

Ordered Value    drink01          Total Frequency
1                1=drinker        234
2                0=nondrinker      60

Probability modeled is drink01='1=drinker'.

The LOGISTIC Procedure

Analysis of Maximum Likelihood Estimates

Parameter    DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
Intercept    1      1.8269     0.2880            40.2469
CASES        1     -0.4406     0.8414             0.2742
female       1     -0.7743     0.3455             5.0223
fem_case     1      0.9386     0.9579             0.9602