Age-period-cohort models in epidemiology: advantages and disadvantages

Author: Amanda Spencer
Eva Gelnarová, Institute of Biostatistics and Analyses


Carcinoma diseases

• Nowadays: the second most common cause of death in industrially developed countries.
• Mid-20th century: a seemingly epidemic appearance in Europe and North America. A civilization disease? → Epidemiology of chronic diseases
• Archaeological findings: carcinoma has always been present in the population, e.g. breast cancer (C50):
  - papyri from about 3000 B.C.
  - Hippocrates
  - Galen

Factors for development of the disease

The occurrence of carcinoma differs between parts of the world and between races.

1. Internal factors (genetically conditioned)
2. External factors, e.g. exposure to carcinogenic chemicals, lifestyle
3. Age: the number of spontaneous mutations increases, genetic instability

Over the last fifty years the average age of the population has increased rapidly, so the number of cases in the population increases.

Incidence

• Incidence is defined as the number of new cases per 100 000 persons in the population.
• It is the number of persons who develop the disease, divided by the number of persons (person-years) at risk and multiplied by 100 000:

$$\text{incidence} = \frac{\text{number of new cases}}{\text{number of persons at risk}} \times 100\,000$$

• Age-specific incidence (incidence rate): the investigated population includes only persons in a specific age category.
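The definition above can be sketched in a few lines of Python. This is only an illustration: the function name `incidence_per_100k` and all counts are hypothetical and do not come from any registry.

```python
def incidence_per_100k(new_cases, persons_at_risk):
    """New cases per 100 000 persons (or person-years) at risk."""
    if persons_at_risk <= 0:
        raise ValueError("persons_at_risk must be positive")
    return new_cases * 100_000 / persons_at_risk


# Hypothetical counts for two age categories (illustration only).
age_groups = {
    "50-54": {"new_cases": 120, "at_risk": 80_000},
    "70-74": {"new_cases": 150, "at_risk": 50_000},
}

# Age-specific incidence: each rate uses only the persons in that age category.
age_specific = {
    group: incidence_per_100k(v["new_cases"], v["at_risk"])
    for group, v in age_groups.items()
}

# Crude incidence: all cases over all persons at risk.
crude = incidence_per_100k(
    sum(v["new_cases"] for v in age_groups.values()),
    sum(v["at_risk"] for v in age_groups.values()),
)

for group, rate in age_specific.items():
    print(f"{group}: {rate:.1f} per 100 000")
print(f"crude: {crude:.1f} per 100 000")
```

The crude rate lies between the two age-specific rates and depends on the age structure of the population, which is one reason why age-specific rates are analysed.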

Three time scales

• Task of interest: on which factors does incidence depend, and how? Estimate the future development of incidence via projection.
• Because additional information is lacking (long incubation period; e.g. historical pollution monitoring is not available), the factors are expressed as a function of time.
• 3 time scales:
  - age at diagnosis
  - year of diagnosis
  - year of birth

Currently available information

• Data sources: oncological registries
• No individual records currently available
• Number of persons (person-years) at risk
• Number of persons who develop the disease over an interval of time
• Gender, region, stage ...
• Age at diagnosis (age: a, number of age groups: A)
• Date of diagnosis (period: p, number of periods: P)
• Date of birth (cohort: c, number of cohorts: C = A – 1 + P)
  - The artificial cohort is determined by age and period: c = p – a
  - This is a restricted version of a more general age-by-period cohort interaction.


Lexis diagram

Rows: age group (mean age), AGE. Columns: calendar time (mean date of diagnosis), PERIOD. Cells: mean year of birth, i.e. the artificial cohort c = p – a.

Age group      1977–1981  1982–1986  1987–1991  1992–1996  1997–2001
(mean age)     (1979)     (1984)     (1989)     (1994)     (1999)
20–24 (22)     1957       1962       1967       1972       1977
25–29 (27)     1952       1957       1962       1967       1972
30–34 (32)     1947       1952       1957       1962       1967
35–39 (37)     1942       1947       1952       1957       1962
40–44 (42)     1937       1942       1947       1952       1957
45–49 (47)     1932       1937       1942       1947       1952
50–54 (52)     1927       1932       1937       1942       1947
55–59 (57)     1922       1927       1932       1937       1942
60–64 (62)     1917       1922       1927       1932       1937
65–69 (67)     1912       1917       1922       1927       1932
70–74 (72)     1907       1912       1917       1922       1927
75–79 (77)     1902       1907       1912       1917       1922
80–84 (82)     1897       1902       1907       1912       1917
85–89 (87)     1892       1897       1902       1907       1912
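The cells of the Lexis diagram follow directly from c = p – a applied to the mean age and the mean date of diagnosis. A minimal sketch, assuming five-year age groups and periods as in the diagram (the function name `midpoint_cohorts` is hypothetical):

```python
def midpoint_cohorts(age_mids, period_mids):
    """Return table[age][period] = mean year of birth = period midpoint - age midpoint."""
    return [[p - a for p in period_mids] for a in age_mids]


age_mids = list(range(22, 88, 5))         # 22, 27, ..., 87  -> A = 14 age groups
period_mids = list(range(1979, 2000, 5))  # 1979, ..., 1999  -> P = 5 periods

table = midpoint_cohorts(age_mids, period_mids)

# Each diagonal of the table shares one cohort, so the number of distinct
# cohorts is C = A - 1 + P.
distinct_cohorts = sorted({c for row in table for c in row})

print(table[0])               # youngest age group, 20-24
print(table[-1])              # oldest age group, 85-89
print(len(distinct_cohorts))  # number of artificial cohorts
```

With A = 14 and P = 5 this gives 18 distinct cohorts, ranging from 1892 to 1977.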


Age – period – cohort models

• Complex analysis of trends in incidence, widely used in
  - Epidemiology
  - Demography
  - Sociology
• We consider APC models that fall into the class of GLMs (log-linear models).


Age – period – cohort models: Assumptions

1. The number of cases in age group i at time period j is denoted $y_{ij}$ and is a realisation of a Poisson random variable with mean $\theta_{ij}$, where i = 1, ..., A and j = 1, ..., P.
2. The number of persons at risk in age group i at time period j, $N_{ij}$, is a fixed known value.
3. The random variables $y_{ij}$ are jointly independent.
4. The logarithm of the expected rate is a linear function:

$$\ln(E[r_{ij}]) = \ln(\theta_{ij} / N_{ij}) = \mu + \alpha_i + \beta_j + \gamma_k,$$

where i = 1, …, A, j = 1, …, P, k = 1, …, C, and k is the birth cohort belonging to age group i and period j.

Effects description

$$\ln(E[r_{ij}]) = \ln(\theta_{ij} / N_{ij}) = \mu + \alpha_i + \beta_j + \gamma_k,$$

$$E[r_{ij}] = \theta_{ij} / N_{ij} = \exp(\mu) \cdot \exp(\alpha_i) \cdot \exp(\beta_j) \cdot \exp(\gamma_k),$$

• $\mu$: mean effect
• $\alpha_i$: effect of age group i; differing risks associated with different age groups
• $\beta_j$: effect of time period j; a change in rate that is associated with all age groups simultaneously
• $\gamma_k$: effect of the kth birth cohort; long-term habits
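The multiplicative form of the model can be illustrated with a short sketch. All effect values and populations below are hypothetical, and the cohort index is assumed to follow the common convention k = A – i + j for equally wide age groups and periods (the slides only state c = p – a).

```python
import math


def cohort_index(i, j, A):
    """Cohort index k (1-based) of age group i and period j; assumes k = A - i + j."""
    return A - i + j


def expected_rate(mu, alpha, beta, gamma, i, j):
    """E[r_ij] = exp(mu + alpha_i + beta_j + gamma_k) for 1-based i and j."""
    k = cohort_index(i, j, len(alpha))
    return math.exp(mu + alpha[i - 1] + beta[j - 1] + gamma[k - 1])


# Hypothetical effects on the log scale (illustration only).
mu = math.log(50 / 100_000)     # baseline rate of 50 per 100 000
alpha = [0.0, 0.5, 0.9]         # A = 3 age groups
beta = [0.0, 0.1]               # P = 2 periods
gamma = [0.2, 0.1, 0.0, -0.1]   # C = A - 1 + P = 4 cohorts

# Hypothetical persons at risk N_ij (rows: age groups, columns: periods).
N = [[10_000, 11_000], [9_000, 9_500], [7_000, 7_200]]

# Expected number of cases theta_ij = N_ij * E[r_ij].
theta = [
    [N[i - 1][j - 1] * expected_rate(mu, alpha, beta, gamma, i, j)
     for j in range(1, len(beta) + 1)]
    for i in range(1, len(alpha) + 1)
]

for row in theta:
    print([round(x, 2) for x in row])
```

Because the effects add on the log scale, exp(α_i) acts as a rate ratio: with the values above, age group 2 contributes a factor exp(0.5) relative to age group 1.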


Estimating parameters of age – period – cohort models

Maximum likelihood estimates
• The parameters of the model can be estimated with any statistical package that offers a generalized linear modelling procedure, e.g. GLIM, R or SAS.
• The GLM must be fitted with the constraints $\alpha_{a_0} = \beta_{p_0} = \gamma_{c_0} = 0$ (*), plus one additional constraint or assumption.
• Fixed effect model
• Possible refinements (natural splines etc.)

Bayesian statistics
• BAMP (“Bayesian Age-Period-Cohort Modeling and Prediction”), www.stat.uni-muenchen.dc/NSchmidt/bamp

Age – period – cohort models: Problems

• Artificial birth cohorts form a sequence of overlapping intervals (usually ignored).
• There is an exact linear dependency among the three factors:
  - Only one cohort is associated with each cell of the two-way table.
  - The design matrix is singular.
  - There are infinitely many solutions.
  - This is the identifiability problem.
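The singular design matrix can be checked numerically. The sketch below builds a dummy-coded design matrix for a small table, drops the first age, period and cohort level as reference categories (one possible choice for the constraints of type (*)), and computes the rank exactly. It assumes the cohort index k = A – i + j; the function names are hypothetical.

```python
from fractions import Fraction


def apc_design_matrix(A, P):
    """Dummy-coded APC design matrix; first age, period and cohort level are the reference."""
    C = A + P - 1
    rows = []
    for i in range(1, A + 1):
        for j in range(1, P + 1):
            k = A - i + j  # assumed cohort index of cell (i, j)
            row = [1]                                              # intercept
            row += [1 if i == a else 0 for a in range(2, A + 1)]   # age dummies
            row += [1 if j == p else 0 for p in range(2, P + 1)]   # period dummies
            row += [1 if k == c else 0 for c in range(2, C + 1)]   # cohort dummies
            rows.append(row)
    return rows


def matrix_rank(matrix):
    """Rank by Gaussian elimination in exact rational arithmetic."""
    m = [[Fraction(x) for x in row] for row in matrix]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            if m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[rank])]
        rank += 1
        if rank == len(m):
            break
    return rank


X = apc_design_matrix(A=4, P=5)
n_columns = len(X[0])
print(n_columns, matrix_rank(X))
```

Even after the reference levels are removed, the rank is one less than the number of columns, so one further constraint or assumption is needed before the parameters are uniquely determined.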


Identifiability problem’s solutions

• Usage of individual records:
  - tabulation, triangular tabulation, no artificial rectangular cohorts
  - hierarchical, random effect models
• Additional assumptions: the cohort or the period trend is superior. Sequential method (e.g. first fit an age-cohort model, then fit the residuals of this model on period).
• Additional constraints: e.g. $\beta_{p_0} = \beta_{p_1} = 0$.
• Holford’s method: fit a model with any parametrisation, then regress the age estimates on age (period estimates on period, cohort, …). The residuals represent the deviation from linearity. The linear trend cannot be uniquely assigned to any of the 3 time scales.
• Intrinsic estimator
• Age-period-cohort characteristic models (O’Brien, 2000): a cohort characteristic variable is used instead of cohort dummy variables.
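The idea behind Holford’s method can be sketched as follows: two parametrisations that differ only in how a linear trend is distributed over the three time scales give the same fitted values, and the residuals from regressing each set of estimates on its index (the deviations from linearity) are identical as well. All effect values are hypothetical, the cohort index is assumed to be k = A – i + j, and the helper names are illustrative rather than part of any package.

```python
import math


def linear_predictor(mu, alpha, beta, gamma):
    """mu + alpha_i + beta_j + gamma_k for every cell, with k = A - i + j (assumed)."""
    A, P = len(alpha), len(beta)
    return [[mu + alpha[i - 1] + beta[j - 1] + gamma[A - i + j - 1]
             for j in range(1, P + 1)] for i in range(1, A + 1)]


def shift_trend(mu, alpha, beta, gamma, delta):
    """Equivalent parametrisation: add slope delta to age and cohort, remove it from period."""
    A = len(alpha)
    return (mu - delta * A,
            [a + delta * i for i, a in enumerate(alpha, 1)],
            [b - delta * j for j, b in enumerate(beta, 1)],
            [g + delta * k for k, g in enumerate(gamma, 1)])


def detrend(values):
    """Residuals from a least-squares regression of the values on their index 1..n."""
    n = len(values)
    xs = list(range(1, n + 1))
    x_mean, y_mean = sum(xs) / n, sum(values) / n
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, values))
             / sum((x - x_mean) ** 2 for x in xs))
    return [y - (y_mean + slope * (x - x_mean)) for x, y in zip(xs, values)]


# Hypothetical estimates from one parametrisation (illustration only).
mu, alpha, beta = -7.0, [0.0, 0.6, 1.0, 1.2], [0.0, 0.1, 0.15]
gamma = [0.3, 0.25, 0.1, 0.0, -0.2, -0.3]

mu2, alpha2, beta2, gamma2 = shift_trend(mu, alpha, beta, gamma, delta=0.4)

same_fit = all(
    math.isclose(x, y, abs_tol=1e-12)
    for row1, row2 in zip(linear_predictor(mu, alpha, beta, gamma),
                          linear_predictor(mu2, alpha2, beta2, gamma2))
    for x, y in zip(row1, row2)
)
print(same_fit)
print([round(r, 4) for r in detrend(alpha)])
print([round(r, 4) for r in detrend(alpha2)])
```

The two parametrisations have different age slopes but the same curvature, which is why only the deviations from linearity can be interpreted without further assumptions.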

Effect of different parametrisations: Danish testis cancer data


Example: C50 breast carcinoma

• Only females
• A = 13 age groups (30 years and older)
• 1977 – 2003, stratified into P = 27 periods
• 39 cohorts (C = P + A – 1)

[Figure: rates plotted in three panels against date of birth, date of diagnosis and age at diagnosis; curves are labelled by age (30 to 90) and by year of diagnosis (1977 to 2003).]


Comparison: Holford, Sequential, “weighted” method

• Parametrisation in which the cohort effect is taken to be the major one
• Methods compared: Holford, “weighted”, Sequential

Data fit
• The cohort effect is major.
• The period effect is marginal.

[Figure: estimated effects for the three methods; rate (log scale) against age (30 to 90) and calendar time (1900 to 2020).]


Comparison: Holford, Sequential, “weighted” method

• Parametrisation in which the period effect is taken to be the major one
• Methods compared: Holford, “weighted”, Sequential

Data fit
• The period effect is not so major.
• The cohort effect is as significant as the period effect (how to quantify?).

Both models:
• The cohort effect is needed.

[Figure: estimated effects for the three methods; rate (log scale) against age (30 to 90) and calendar time (1900 to 2020).]


[Figure: estimated effects for the “Period major” and “Cohort major” parametrisations; rate (log scale) against age (30 to 90) and calendar time (1900 to 2020).]

Goodness of fit - residual deviance

Model               Resid. Df   Resid. Dev
Age                 343         4658.3
Age-drift           342         1058.4
Age-Cohort          336          720.8
Age-Period-Cohort   330          642.1
Age-Period          336         1006.1
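Nested models in this table can be compared through differences in residual deviance. The sketch below uses the values from the table; the chi-square reference distribution is the usual large-sample approximation for nested GLMs and is an assumption here, not something stated on the slides. The helper `chi2_sf` is a small stdlib-only implementation that covers 1 degree of freedom and even degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper tail probability of the chi-square distribution (df == 1 or even df only)."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df > 0 and df % 2 == 0:
        half = x / 2
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    raise ValueError("only df == 1 or even df are supported in this sketch")


# (residual df, residual deviance) as given in the table.
fits = {
    "Age": (343, 4658.3),
    "Age-drift": (342, 1058.4),
    "Age-Cohort": (336, 720.8),
    "Age-Period": (336, 1006.1),
    "Age-Period-Cohort": (330, 642.1),
}


def compare(smaller, larger):
    """Deviance drop, df difference and approximate p-value for two nested models."""
    df_s, dev_s = fits[smaller]
    df_l, dev_l = fits[larger]
    d_dev, d_df = dev_s - dev_l, df_s - df_l
    return d_dev, d_df, chi2_sf(d_dev, d_df)


for smaller, larger in [("Age", "Age-drift"),
                        ("Age-drift", "Age-Cohort"),
                        ("Age-drift", "Age-Period"),
                        ("Age-Cohort", "Age-Period-Cohort"),
                        ("Age-Period", "Age-Period-Cohort")]:
    d_dev, d_df, p = compare(smaller, larger)
    print(f"{smaller} -> {larger}: deviance drop {d_dev:.1f} on {d_df} df, p = {p:.3g}")
```

Adding the cohort effect to the Age-Period model lowers the deviance by 364.0 on 6 degrees of freedom, whereas adding the period effect to the Age-Cohort model lowers it by 78.7 on 6 degrees of freedom. Because the residual deviances remain well above their degrees of freedom, such p-values should be read as indicative only.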


BAMP results

• Model (median) deviance: 360.724

[Figure: age, period and cohort effects estimated by BAMP; age groups 30 - 34 to 90 - 94, periods 1977 to 2001, cohorts 1882 - 1887 to 1962 - 1967.]


Conclusion

• Age-period-cohort models are appropriate for fitting observed cancer incidence rates.
• Because of the identifiability problem, the results must be treated with caution.
• To construct the model we need additional information, which is, unfortunately, exactly what we want as the model output.


Literature

• O’Brien, Robert M. (2000): Age Period Cohort Characteristic Models. Social Science Research, 29, pp. 123-139.

