QMIN

Logistic Regression - 1

Psychology 5741 (Neuroscience)
Logistic Regression

Data Set: Logistic
SAS input: Logistic.input.sas

Background: The purpose of this study was to explore the action of a GABA (γ-aminobutyric acid) blocker on seizures. In most areas of the brain GABA is an inhibitory neurotransmitter, so blocking GABA might in theory lead to excitation and possibly seizures. Rats were given a dose of the blocker and then assessed over a 30-minute period for seizures. Afterwards, the rats were sacrificed, their brains were dissected, and a measure of the GABA receptors blocked by the drug was obtained: the larger the number, the greater the number of receptors blocked (per unit volume). The data set also includes the sex of the rat (0 = female, 1 = male).

Background to Logistic Regression

Logistic regression is used to predict two different types of dependent variables. The first type is a dichotomous dependent variable that takes on one of two mutually exclusive states. Examples of such a variable are “success versus fail,” “correct versus incorrect,” and “schizophrenic versus not schizophrenic.” The second type of dependent variable is an ordinal scale of response. Here, a study on schizophrenia might assign a value of 0 to those participants who lack appreciable schizophrenic pathology, a value of 1 to those with schizotypal personality but not full-blown schizophrenia, and a value of 2 to schizophrenics.

Ordinary regression computes a predicted value of a dependent variable as a linear function of a set of predictor (or independent) variables. The equation is

$$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_k X_k .$$

Logistic regression also begins with a linear function of the predictor (or independent) variables, but this linear function does not equal the predicted value of the dependent variable. Instead, the linear function predicts a new variable that we will denote as L, for liability towards the dependent variable. Hence, the starting equation for logistic regression is

$$L = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_k X_k .$$

Then the probability that the dependent variable takes on a specific state is a function of the liability dimension, L:

$$\Pr(Y = \text{State 1}) = \frac{\exp(L)}{1 + \exp(L)} .$$

In the current example, we want to predict the presence of a seizure from two variables in the data set, sex of the rat and the amount of GABA blocked. We begin by creating a model that predicts the liability of developing a seizure. We should be familiar with writing this type of model because it is the one that we have used for ANOVA and regression. We write L as a function of sex and GABA blockage and allow for the possibility of an interaction between sex and GABA blockage. The equation is

$$L = b_0 + b_1\,\text{sex} + b_2\,\text{GABA} + b_3\,\text{sex} \times \text{GABA} .$$

Hence, the probability that an animal has a seizure equals

$$\Pr(\text{Seizure}) = \frac{\exp(L)}{1 + \exp(L)} .$$
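As a short worked step (added here for illustration, not taken from the handout), the probability equation can be inverted to show that L is the natural log of the odds of a seizure, and evaluating it at L = 0 gives a probability of one half:

```latex
\begin{align*}
\Pr(\text{Seizure}) &= \frac{\exp(L)}{1+\exp(L)} = \frac{1}{1+\exp(-L)}, \\
\frac{\Pr(\text{Seizure})}{1-\Pr(\text{Seizure})} &= \exp(L)
\quad\Longrightarrow\quad
L = \ln\!\left[\frac{\Pr(\text{Seizure})}{1-\Pr(\text{Seizure})}\right], \\
L = 0 \;&\Longrightarrow\; \Pr(\text{Seizure}) = \frac{\exp(0)}{1+\exp(0)} = \frac{1}{2}.
\end{align*}
```

Positive values of L therefore correspond to probabilities above one half and negative values to probabilities below one half.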

SAS PROC LOGISTIC

The text below shows a SAS program that performs the logistic regression detailed above.

    PROC LOGISTIC DATA=logistic;
      MODEL seizure = sex gababl sex*gababl;
    RUN;

Note that the MODEL statement takes the same syntax as the MODEL statement for PROC GLM or PROC REG. PROC LOGISTIC will automatically parse the MODEL statement and create the appropriate mathematical equations to solve for the unknown coefficients (i.e., the bs). Output from this procedure is given below. We will examine individual sections of the output and comment on them.

The LOGISTIC Procedure

Model Information

    Data Set                     WORK.LOGISTIC
    Response Variable            seizure
    Number of Response Levels    2
    Number of Observations       121
    Model                        binary logit
    Optimization Technique       Fisher's scoring

Response Profile

    Ordered                 Total
    Value      seizure      Frequency
    1          0            77
    2          1            44

Probability modeled is seizure=0.

This first section of the output provides descriptive information about the logistic regression by naming the data set, the dependent variable, the number of observations, and other technical information about the method of analysis. Make certain to examine the table labeled “Response Profile.” The last line of this section (“Probability modeled is seizure=0”) gives the state of the dependent variable that the model is trying to predict. In the present case, we are predicting the absence of a seizure. (This should not be of concern because, as we see later, we only have to reverse the sign of the coefficients to predict the presence of a seizure.)
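To have SAS model the presence of a seizure directly, one possible sketch is shown below. It is not part of the original program and assumes the same data set and variable names used above. The DESCENDING option reverses the ordering of the response levels so that seizure=1 becomes the modeled state. Because 1 − exp(L)/(1 + exp(L)) = exp(−L)/(1 + exp(−L)), the estimated coefficients should equal the original ones with their signs reversed.

```sas
/* Sketch: model Pr(seizure = 1) instead of Pr(seizure = 0).   */
/* DESCENDING reverses the sort order of the response levels.  */
PROC LOGISTIC DATA=logistic DESCENDING;
  MODEL seizure = sex gababl sex*gababl;
RUN;

/* Alternative: name the event level explicitly in the MODEL statement. */
PROC LOGISTIC DATA=logistic;
  MODEL seizure(EVENT='1') = sex gababl sex*gababl;
RUN;
```

With either variant, check that the Response Profile section of the output now ends with “Probability modeled is seizure=1.”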

Model Fit Statistics

                   Intercept    Intercept and
    Criterion      Only         Covariates
    AIC            160.627      134.870
    SC             163.422      146.053
    -2 Log L       158.627      126.870
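The three criteria are related by simple arithmetic. As a check, the block below uses the standard definitions AIC = −2 Log L + 2p and SC = −2 Log L + p ln(n), which are stated here as background and do not come from the handout. Here p is the number of estimated parameters and n = 121 is the number of observations:

```latex
\begin{align*}
\text{Intercept only } (p = 1):\quad
  \mathrm{AIC} &= 158.627 + 2(1) = 160.627, &
  \mathrm{SC} &= 158.627 + 1\cdot\ln(121) \approx 163.42, \\
\text{Intercept and covariates } (p = 4):\quad
  \mathrm{AIC} &= 126.870 + 2(4) = 134.870, &
  \mathrm{SC} &= 126.870 + 4\cdot\ln(121) \approx 146.05, \\
\text{Difference in } {-2}\log L:\quad
  \chi^2 &= 158.627 - 126.870 = 31.757, & df &= 4 - 1 = 3 .
\end{align*}
```

The difference in −2 Log L between the two models is the likelihood ratio chi-square that SAS reports in its test of the global null hypothesis.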

Testing Global Null Hypothesis: BETA=0

    Test                 Chi-Square    DF    Pr > ChiSq
    Likelihood Ratio        31.7570     3
    Score                   27.3657     3
    Wald                    20.1104     3