Poisson Regression for Rates

Poisson Regression for Rates > ############ Poisson Regression for Rates R code ######################### > # Treat lambda as the expected count per u...
Author: Ralf Pierce
2 downloads 2 Views 147KB Size
Poisson Regression for Rates > ############ Poisson Regression for Rates R code ######################### > # Treat lambda as the expected count per unit time or population > # t*exp(b0+b1X1+....) = exp(b0+b1X1+...+ln(t)) > # Note: No coefficient for the ln(t) term, it is "offset" > > # Number of lung cancer cases in 4 Danish cities 1968-1971. > # From: James K Lindsey (1995). Modelling frequency and count data, Clarendon Press, page 157. > # Reference: Andersen, E. B. (1977). > # Multiplicative Poisson models with unequal cell rates, > # Scandinavian Journal of Statistics, 4, pages 153–158. > # age column: midpoint of Age column, with 75 chosen for >74. > > lungs attach( lungs ) > lungs Cases Pop Age City age 1 11 3059 40-54 Fredericia 47 2 11 800 55-59 Fredericia 57 3 11 710 60-64 Fredericia 62 4 10 581 65-69 Fredericia 67 5 11 509 70-74 Fredericia 72 6 10 605 >74 Fredericia 75 7 13 2879 40-54 Horsens 47 8 6 1083 55-59 Horsens 57 9 15 923 60-64 Horsens 62 10 10 834 65-69 Horsens 67 11 12 634 70-74 Horsens 72 12 2 782 >74 Horsens 75 13 4 3142 40-54 Kolding 47 14 8 1050 55-59 Kolding 57 15 7 895 60-64 Kolding 62 16 11 702 65-69 Kolding 67 17 9 535 70-74 Kolding 72 18 12 659 >74 Kolding 75 19 5 2520 40-54 Vejle 47 20 7 878 55-59 Vejle 57 21 10 839 60-64 Vejle 62 22 14 631 65-69 Vejle 67 23 8 539 70-74 Vejle 72 24 7 619 >74 Vejle 75 > fit summary(fit) Call: glm(formula = Cases ~ age + offset(log(Pop)), family = poisson, data = lungs) Deviance Residuals: Min 1Q Median 3Q Max -4.2389 -0.4559 0.1840 0.8213 2.0335

1

Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) -8.188971 0.421262 -19.439 abline(h=0) > > fit2 summary(fit2)$coef Estimate Std. Error z value Pr(>|z|) (Intercept) -21.434384043 3.0887571330 -6.939485 3.935317e-12 age 0.501159509 0.1020193966 4.912394 8.997086e-07 I(age^2) -0.003633982 0.0008289412 -4.383884 1.165821e-05 > anova(fit2, test="Chisq") Analysis of Deviance Table Model: poisson, link: log Response: Cases Terms added sequentially (first to last) Df Deviance Resid. Df Resid. Dev Pr(>Chi) NULL 23 129.908 age 1 79.210 22 50.698 < 2.2e-16 *** I(age^2) 1 19.732 21 30.966 8.911e-06 ***

2

> dev.new() > plot(fitted(fit2),residuals(fit2,type="pearson"),ylab="Pearson residual", + xlab="Predicted Cases")

> abline(h=0) > > fitCity summary(fitCity)$coef Estimate Std. Error z value Pr(>|z|) (Intercept) -21.451864835 3.0889482407 -6.944715 3.792254e-12 age 0.509650433 0.1021173677 4.990830 6.012036e-07 I(age^2) -0.003701243 0.0008297504 -4.460670 8.170370e-06 CityHorsens -0.332668142 0.1814706365 -1.833179 6.677597e-02 CityKolding -0.375524185 0.1877886250 -1.999717 4.553079e-02 CityVejle -0.274024990 0.1878413800 -1.458811 1.446173e-01 > anova(fitCity, test="Chisq") Analysis of Deviance Table Model: poisson, link: log Response: Cases Terms added sequentially (first to last) Df Deviance Resid. Df Resid. Dev Pr(>Chi) NULL 23 129.908 age 1 79.210 22 50.698 < 2.2e-16 *** I(age^2) 1 19.732 21 30.966 8.911e-06 *** City 3 4.949 18 26.017 0.1756 --Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 >

3

> # Use fit2 and find the predicted number of cases for the first row of data > fitted( fit2 )[1] 1 8.318571 > > # Show how this was obtained > fit2$coef # coefficients (Intercept) age I(age^2) -21.434384043 0.501159509 -0.003633982 > lungs[1,] # explanatory data Cases Pop Age City age 1 11 3059 40-54 Fredericia 47 > 3059* exp( -21.4344 + 0.50116*47 - 0.003634*47^2 ) [1] 8.318295 > > # Mathematical equivalent > exp( -21.4344 + 0.50116*47 - 0.003634*47^2 + log( 3059 ) ) [1] 8.318295 >

4

> > > > > >

# Plot expected number of cases per 100,000 population ages > > >

ecount dev.new() > plot(age, Cases/Pop); title(main="See what's driving the quadratic")

>

5

############ Poisson Regression for Rates R code ######################### # Treat lambda as the expected count per unit time or population # t*exp(b0+b1X1+....) = exp(b0+b1X1+...+ln(t)) # Note: No coefficient for the ln(t) term, it is "offset" # # # # # #

Number of lung cancer cases in 4 Danish cities 1968-1971. From: James K Lindsey (1995). Modelling frequency and count data, Clarendon Press, page 157. Reference: Andersen, E. B. (1977). Multiplicative Poisson models with unequal cell rates, Scandinavian Journal of Statistics, 4, pages 153–158. age column: midpoint of Age column, with 75 chosen for >74.

lungs

Suggest Documents