Structural Equation Modeling with lavaan

Author: Ronald Marsh
Yves Rosseel
Department of Data Analysis, Ghent University

Summer School – Using R for personality research
August 23–28, 2014, Bertinoro, Italy

Contents

1 Introduction to SEM
  1.1 From regression to structural equation modeling
  1.2 The model-implied covariance matrix (the essence of SEM)
  1.3 Matrix representation in a CFA model
  1.4 The implied covariance matrix for the full SEM model
  1.5 Model parameters and model matrices
  1.6 Model estimation
  1.7 Model evaluation
  1.8 Model respecification
  1.9 Reporting your results
  1.10 Further reading
2 Introduction to lavaan
3 Further topics
  3.1 Meanstructures
  3.2 Multiple groups
  3.3 Measurement invariance
  3.4 Missing data
  3.5 Nonnormal data and alternative estimators
  3.6 Handling categorical endogenous variables
4 Appendices
  4.1 Rules about variances and covariances
  4.2 lavaan: a brief user’s guide
  4.3 Estimator ML, MLM and MLR: robust standard errors and scaled test statistics

1 Introduction to SEM

1.1 From regression to structural equation modeling

univariate linear regression

[Path diagram: predictors x1, x2, x3, x4 pointing to the outcome y, with intercept β0, slopes β1, β2, β3, β4 and residual ε]

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \beta_4 x_{i4} + \epsilon_i \qquad (i = 1, 2, \ldots, n)$$
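As a rough illustration, the regression equation above can be fitted in R with `lm()`. The data below are simulated; the sample size, the coefficient values and the object names are assumptions chosen only for this sketch.

```r
# Simulated data: the 'true' coefficients below are illustrative assumptions
set.seed(1)
n  <- 200
x1 <- rnorm(n); x2 <- rnorm(n); x3 <- rnorm(n); x4 <- rnorm(n)
y  <- 1 + 0.5 * x1 - 0.3 * x2 + 0.2 * x3 + 0.1 * x4 + rnorm(n)
dat <- data.frame(y, x1, x2, x3, x4)

# Univariate linear regression: one outcome, four predictors
fit_lm <- lm(y ~ x1 + x2 + x3 + x4, data = dat)

# Estimates of beta0 (intercept) and beta1, ..., beta4
coef(fit_lm)
```

The formula `y ~ x1 + x2 + x3 + x4` is also valid lavaan model syntax, which is one way to see regression as a special case of SEM.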

multivariate regression

[Path diagram: predictors x1, x2, x3, x4 and outcomes y1, y2]

path analysis
• testing models of causal relationships among observed variables
• all variables are observed (manifest)
• system of regression equations

[Path diagram relating the observed variables y1 to y7]

measurement error
• in the social sciences, observed variables are not without measurement error
• single indicator measurement model

[Diagram: a single indicator y of the latent variable η, with measurement error ε]

• multiple indicator measurement model

[Diagram: three indicators y1, y2, y3 of the latent variable η, with measurement errors ε1, ε2, ε3]

confirmatory factor analysis (CFA)
• factor analysis: representing the relationship between one or more latent variables and their (observed) indicators

[Path diagram: indicators y1 to y6 and latent variables η1, η2]

structural equation modeling (SEM)
• path analysis with latent variables

[Path diagram: indicators y1 to y12 and latent variables η1, η2, η3, η4]

1.2 The model-implied covariance matrix (the essence of SEM)

• the goal of SEM is to test an a priori specified theory (which often can be depicted as a path diagram)
• we may have several alternative models, each one with its own path diagram
• each path diagram can be converted to a SEM:
  – measurement model (relationship between latent variables and indicators)
  – structural equations (regressions among latent/observed variables)
• each diagram has ‘model-based’ implications
  – for the model-implied covariance matrix: $\hat{\Sigma}$
  – for the model-implied mean vector: $\hat{\mu}$
  – ...
• different diagrams lead to (potentially) different implications; some implications may not fit with your data

example model-implied covariance matrix (1)
• suppose we have three observed variables, y1, y2 and y3; to explain why they are correlated, we may postulate the following model:

[Path diagram: y1 → y2 with coefficient a, and y1 → y3 with coefficient b]

• if we use the following values for the model parameters: a = 3 and b = 5, σ²(y1) = 10, σ²(ε2) = 20 and σ²(ε3) = 30, then we find (see Appendix):

$$\hat{\Sigma} = \begin{bmatrix} 10 & & \\ 30 & 110 & \\ 50 & 150 & 280 \end{bmatrix}$$
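The matrix for example (1) can be checked by hand with the usual rules for variances and covariances. The sketch below assumes the model y2 = a·y1 + ε2 and y3 = b·y1 + ε3 with mutually uncorrelated residuals; this reading of the diagram is an assumption, but it is the one that reproduces the values shown. The object names are illustrative.

```r
# Parameter values taken from the slide
a <- 3; b <- 5
var_y1 <- 10; var_e2 <- 20; var_e3 <- 30

# Assumed model: y2 = a*y1 + e2 and y3 = b*y1 + e3,
# with e2 and e3 uncorrelated with y1 and with each other
vars   <- c("y1", "y2", "y3")
Sigma1 <- matrix(0, 3, 3, dimnames = list(vars, vars))

Sigma1["y1", "y1"] <- var_y1
Sigma1["y2", "y2"] <- a^2 * var_y1 + var_e2                   # var(y2)
Sigma1["y3", "y3"] <- b^2 * var_y1 + var_e3                   # var(y3)
Sigma1["y2", "y1"] <- Sigma1["y1", "y2"] <- a * var_y1        # cov(y1, y2)
Sigma1["y3", "y1"] <- Sigma1["y1", "y3"] <- b * var_y1        # cov(y1, y3)
Sigma1["y3", "y2"] <- Sigma1["y2", "y3"] <- a * b * var_y1    # cov(y2, y3)

Sigma1
```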

example model-implied covariance matrix (2)
• but if we change the path diagram (and keep the parameter values fixed), the model-implied covariance matrix will also change:

[Path diagram: y1 → y2 with coefficient a, and y2 → y3 with coefficient b]

we find

$$\hat{\Sigma} = \begin{bmatrix} 10 & & \\ 30 & 110 & \\ 150 & 550 & 2780 \end{bmatrix}$$

• two models are said to be equivalent if they imply the same covariance matrix (but note that we did not estimate the parameters here)
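For a path model among observed variables, the implied covariance matrix can also be obtained in matrix form as (I − B)⁻¹ Ψ (I − B)⁻¹′, where B holds the regression coefficients and Ψ the (residual) variances. The sketch below applies this to example (2), assuming the chain y1 → y2 → y3; the object names are illustrative.

```r
a <- 3; b <- 5
vars <- c("y1", "y2", "y3")

# B[i, j] holds the regression of variable i on variable j
# Assumed chain model: y1 -> y2 (a) and y2 -> y3 (b)
B <- matrix(0, 3, 3, dimnames = list(vars, vars))
B["y2", "y1"] <- a
B["y3", "y2"] <- b

# Variance of y1 and the two residual variances (values from the slides)
Psi <- diag(c(10, 20, 30))

# Implied covariance matrix: (I - B)^{-1} Psi (I - B)^{-1}'
IB_inv <- solve(diag(3) - B)
Sigma2 <- IB_inv %*% Psi %*% t(IB_inv)
Sigma2
```

Moving the second coefficient to `B["y3", "y1"]` instead gives the matrix of example (1), which shows that the two diagrams have different implications even with identical parameter values.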

example model-implied covariance matrix (3)
• we can also postulate that the correlations among the three observed variables are explained by a common ‘factor’:

[Path diagram: a common factor η with indicators y1 (loading fixed to 1), y2 (loading a) and y3 (loading b)]

we find (using σ²(ε1) = 10, σ²(ε2) = 20, σ²(ε3) = 30, σ²(η) = 1):

$$\hat{\Sigma} = \begin{bmatrix} 11 & & \\ 4 & 36 & \\ 5 & 20 & 55 \end{bmatrix}$$
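The one-factor example can be reproduced with the CFA formula ΛΨΛ′ + Θ. One caveat: the matrix printed above corresponds to loadings a = 4 and b = 5 (with a = 3, the covariance between y1 and y2 would be 3, not 4), so the sketch below uses a = 4 as an assumption. The object names are illustrative.

```r
# Loadings: 1 for y1 (fixed), a for y2, b for y3.
# a = 4 and b = 5 are assumptions: they reproduce the matrix shown on the slide
a <- 4; b <- 5
Lambda <- matrix(c(1, a, b), nrow = 3,
                 dimnames = list(c("y1", "y2", "y3"), "eta"))
Psi    <- matrix(1, 1, 1)         # variance of the factor eta
Theta  <- diag(c(10, 20, 30))     # residual variances of y1, y2, y3

# Implied covariance matrix of a factor model
Sigma3 <- Lambda %*% Psi %*% t(Lambda) + Theta
Sigma3
```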

1.3 Matrix representation in a CFA model

classic example CFA
• well-known dataset; based on Holzinger & Swineford (1939) data
• also analyzed by Jöreskog (1969)
• 9 observed ‘indicators’ measuring three ‘latent’ factors:
  – a ‘visual’ factor measured by x1, x2 and x3
  – a ‘textual’ factor measured by x4, x5 and x6
  – a ‘speed’ factor measured by x7, x8 and x9
• N=301
• we assume the three factors are correlated

diagram of the model

[Path diagram: ‘visual’ measured by x1, x2, x3; ‘textual’ measured by x4, x5, x6; ‘speed’ measured by x7, x8, x9]

what does the data look like?
• our dataset contains measures for all observed variables in the model
• in many SEM applications, we do not need the full dataset; all we need are summary statistics:
  – the sample-based variance/covariance matrix (S); if there are p variables, the covariance matrix is of size p × p and contains p(p + 1)/2 non-duplicated elements
  – the sample size (N)
  – for some SEM applications, we also need the sample-based p-dimensional mean vector
• note that this is true for many statistical techniques (ANOVA, regression, factor analysis, . . . )
• but for more advanced applications, we need the full dataset anyway (e.g. if we wish to correct for non-normality, if we have missing values, . . . )
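A minimal sketch of these summary statistics in base R. The data matrix is simulated here purely for illustration (p = 9 variables and N = 301 rows, to mirror the example); only `cov()`, `colMeans()` and `nrow()` are needed.

```r
# Simulated stand-in for a dataset with p = 9 variables and N = 301 cases
set.seed(123)
p <- 9
N <- 301
X <- matrix(rnorm(N * p), nrow = N, ncol = p,
            dimnames = list(NULL, paste0("x", 1:p)))

S <- cov(X)         # sample covariance matrix (divisor N - 1), size p x p
m <- colMeans(X)    # sample mean vector, length p
N_obs <- nrow(X)    # sample size

# Number of non-duplicated elements in S: p(p + 1)/2
n_unique <- p * (p + 1) / 2
n_unique
sum(lower.tri(S, diag = TRUE))   # same count, taken from the matrix itself
```

In lavaan, summary statistics of this kind can be supplied through the `sample.cov` and `sample.nobs` arguments instead of `data`.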

data

           x1   x2    x3        x4   x5        x6       x7   x8       x9
1   3.3333333 7.75 0.375 2.3333333 5.75 1.2857143 3.391304 5.75 6.361111
2   5.3333333 5.25 2.125 1.6666667 3.00 1.2857143 3.782609 6.25 7.916667
3   4.5000000 5.25 1.875 1.0000000 1.75 0.4285714 3.260870 3.90 4.416667
4   5.3333333 7.75 3.000 2.6666667 4.50 2.4285714 3.000000 5.30 4.861111
5   4.8333333 4.75 0.875 2.6666667 4.00 2.5714286 3.695652 6.30 5.916667
6   5.3333333 5.00 2.250 1.0000000 3.00 0.8571429 4.347826 6.65 7.500000
7   2.8333333 6.00 1.000 3.3333333 6.00 2.8571429 4.695652 6.20 4.861111
8   5.6666667 6.25 1.875 3.6666667 4.25 1.2857143 3.391304 5.15 3.666667
9   4.5000000 5.75 1.500 2.6666667 5.75 2.7142857 4.521739 4.65 7.361111
10  3.5000000 5.25 0.750 2.6666667 5.00 2.5714286 4.130435 4.55 4.361111
11  3.6666667 5.75 2.000 2.0000000 3.50 1.5714286 3.739130 5.70 4.305556
12  5.8333333 6.00 2.875 2.6666667 4.50 2.7142857 3.695652 5.15 4.138889
13  5.6666667 4.50 4.125 2.6666667 4.00 2.2857143 5.869565 5.20 5.861111
14  6.0000000 5.50 1.750 4.6666667 4.00 1.5714286 5.130435 4.70 4.444444
15  5.8333333 5.75 3.625 5.0000000 5.50 3.0000000 4.000000 4.35 5.861111
16  4.6666667 4.75 2.375 2.6666667 4.25 0.7142857 4.086957 3.80 5.138889
...
301 4.3333333 6.00 3.375 3.6666667 5.75 3.1428571 4.086957 6.95 5.166667

• data is complete
• the ‘covariance matrix’ contains all information about the interrelations among the observed variables

observed covariance matrix: S
• p is the number of observed variables: p = 9
• observed covariance matrix (elements divided by N-1):

      x1    x2    x3    x4    x5    x6    x7    x8    x9
x1  1.36
x2  0.41  1.38
x3  0.58  0.45  1.28
x4  0.51  0.21  0.21  1.35
x5  0.44  0.21  0.11  1.10  1.66
x6  0.46  0.25  0.24  0.90  1.01  1.20
x7  0.09 -0.10  0.09  0.22  0.14  0.14  1.18
x8  0.26  0.11  0.21  0.13  0.18  0.17  0.54  1.02
x9  0.46  0.24  0.37  0.24  0.30  0.24  0.37  0.46  1.02

• we want to ‘explain’ the observed correlations/covariances by postulating a number of latent variables (factors) and a corresponding factor structure
• we will ‘rewrite’ the p(p + 1)/2 = 45 elements in the covariance matrix as a function of a smaller number of ‘free parameters’ in the CFA model, summarized in a number of (typically sparse) matrices

the standard CFA model: matrix representation
• the classic LISREL representation uses three matrices (for CFA)
• the LAMBDA matrix contains the ‘factor structure’:

$$\Lambda = \begin{bmatrix}
x & 0 & 0 \\
x & 0 & 0 \\
x & 0 & 0 \\
0 & x & 0 \\
0 & x & 0 \\
0 & x & 0 \\
0 & 0 & x \\
0 & 0 & x \\
0 & 0 & x
\end{bmatrix}$$

• the variances/covariances of the latent variables are summarized in the PSI matrix:

$$\Psi = \begin{bmatrix} x & & \\ x & x & \\ x & x & x \end{bmatrix}$$

• what we cannot explain by the set of common factors (the ‘residual part’ of the model) is written in the (typically diagonal) matrix THETA:

$$\Theta = \operatorname{diag}(x, x, x, x, x, x, x, x, x)$$

• note that we have only 24 parameters (of which 21 are estimable)
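The parameter count can be spelled out as follows. The step from 24 to 21 assumes that the scale of each of the three factors is set by fixing one parameter per factor to a constant (for example one loading fixed to 1); the degrees of freedom then follow from the 45 non-duplicated elements of S.

```latex
\begin{aligned}
\text{loadings in } \Lambda &: \; 9 \\
\text{variances and covariances in } \Psi &: \; 3(3+1)/2 = 6 \\
\text{residual variances in } \Theta &: \; 9 \\
\text{total} &: \; 9 + 6 + 9 = 24 \\
\text{free parameters} &: \; 24 - 3 = 21 \\
\text{degrees of freedom} &: \; 45 - 21 = 24
\end{aligned}
```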

the standard CFA model: the model implied covariance matrix
• in the standard CFA model, the ‘implied’ covariance matrix is:

$$\Sigma = \Lambda \Psi \Lambda' + \Theta$$

• all parameters are included in three model matrices
• simple matrix multiplication (and addition) gives us the model implied covariance matrix
• for identification purposes, some parameters need to be fixed to a constant
• estimation problem: choose the ‘free’ parameters, so that the estimated implied covariance matrix ($\hat{\Sigma}$) is ‘as close as possible’ to the observed covariance matrix S
  – generalized (weighted) least-squares estimation (GLS, WLS)
  – maximum likelihood estimation (ML)
  – Bayesian approaches
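A minimal sketch of this computation for the three-factor example, assuming the lavaan package is installed and using its built-in `HolzingerSwineford1939` data. The model string and the object names (`HS.model`, `fit_cfa`, `Sigma_hat`) are choices made for this illustration; `lavInspect(fit, "est")` is used to extract the estimated model matrices.

```r
library(lavaan)

# Three correlated factors, three indicators each (as described above)
HS.model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
'
fit_cfa <- cfa(HS.model, data = HolzingerSwineford1939)

# Estimated model matrices (plain matrices after unclass())
est    <- lavInspect(fit_cfa, "est")
Lambda <- unclass(est$lambda)
Psi    <- unclass(est$psi)
Theta  <- unclass(est$theta)

# Model-implied covariance matrix: Lambda Psi Lambda' + Theta
Sigma_hat <- Lambda %*% Psi %*% t(Lambda) + Theta

# Largest absolute difference from lavaan's own result
max(abs(Sigma_hat - unclass(fitted(fit_cfa)$cov)))
```

By default, `cfa()` fixes the first loading of each factor to 1, which is one way of meeting the identification requirement mentioned above.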

observed covariance matrix

       x1     x2     x3     x4     x5     x6     x7     x8     x9
x1  1.358
x2  0.407  1.382
x3  0.580  0.451  1.275
x4  0.505  0.209  0.208  1.351
x5  0.441  0.211  0.112  1.098  1.660
x6  0.455  0.248  0.244  0.896  1.015  1.196
x7  0.085 -0.097  0.088  0.220  0.143  0.144  1.183
x8  0.264  0.110  0.212  0.126  0.181  0.165  0.535  1.022
x9  0.458  0.244  0.374  0.243  0.295  0.236  0.373  0.457  1.015

model-implied covariance matrix

       x1     x2     x3     x4     x5     x6     x7     x8     x9
x1  1.358
x2  0.448  1.382
x3  0.590  0.327  1.275
x4  0.408  0.226  0.298  1.351
x5  0.454  0.252  0.331  1.090  1.660
x6  0.378  0.209  0.276  0.907  1.010  1.196
x7  0.262  0.145  0.191  0.173  0.193  0.161  1.183
x8  0.309  0.171  0.226  0.205  0.228  0.190  0.453  1.022
x9  0.284  0.157  0.207  0.188  0.209  0.174  0.415  0.490  1.015

1.4 The implied covariance matrix for the full SEM model

• in the LISREL representation, we need an additional matrix (B):

$$\Sigma = \Lambda (I - B)^{-1} \Psi (I - B)'^{-1} \Lambda' + \Theta$$

where B summarizes the regressions among the latent variables

• we need this extended model for
  – second-order CFA
  – MIMIC models
  – SEM models
• in LISREL parlance, this is the ‘all-y’ model
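As a quick check on the extended formula: when there are no regressions among the latent variables, B is a zero matrix and the expression reduces to the standard CFA result from section 1.3.

```latex
\begin{aligned}
\Sigma &= \Lambda (I - B)^{-1} \Psi (I - B)'^{-1} \Lambda' + \Theta \\
       &= \Lambda \, I \, \Psi \, I \, \Lambda' + \Theta \qquad \text{(setting } B = 0\text{)} \\
       &= \Lambda \Psi \Lambda' + \Theta
\end{aligned}
```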

example: Political Democracy
• Industrialization and Political Democracy dataset (N=75)
• This dataset is used throughout Bollen’s 1989 book (see pages 12, 17, 36 in chapter 2, pages 228 and following in chapter 7, pages 321 and following in chapter 8).
• The dataset contains various measures of political democracy and industrialization in developing countries:

  y1: Expert ratings of the freedom of the press in 1960
  y2: The freedom of political opposition in 1960
  y3: The fairness of elections in 1960
  y4: The effectiveness of the elected legislature in 1960
  y5: Expert ratings of the freedom of the press in 1965
  y6: The freedom of political opposition in 1965
  y7: The fairness of elections in 1965
  y8: The effectiveness of the elected legislature in 1965
  x1: The gross national product (GNP) per capita in 1960
  x2: The inanimate energy consumption per capita in 1960
  x3: The percentage of the labor force in industry in 1960

model diagram

[Path diagram: ind60 measured by x1, x2, x3; dem60 measured by y1, y2, y3, y4; dem65 measured by y5, y6, y7, y8]

selection of the output

                   Estimate  Std.err  Z-value  P(>|z|)   Std.lv  Std.all
Latent variables:
  ind60 =~
    x1                1.000                               0.670    0.920
    x2                2.180    0.139   15.742    0.000    1.460    0.973
    x3                1.819    0.152   11.967    0.000    1.218    0.872
  dem60 =~
    y1                1.000                               2.223    0.850
    y2                1.257    0.182    6.889    0.000    2.794    0.717
    y3                1.058    0.151    6.987    0.000    2.351    0.722
    y4                1.265    0.145    8.722    0.000    2.812    0.846
  dem65 =~
    y5                1.000                               2.103    0.808
    y6                1.186    0.169    7.024    0.000    2.493    0.746
    y7                1.280    0.160    8.002    0.000    2.691    0.824
    y8                1.266    0.158    8.007    0.000    2.662    0.828

Regressions:
  dem60 ~
    ind60             1.483    0.399    3.715    0.000    0.447    0.447
  dem65 ~
    ind60             0.572    0.221    2.586    0.010    0.182    0.182
    dem60             0.837    0.098    8.514    0.000    0.885    0.885
...
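A sketch of how output of this kind can be produced, assuming the lavaan package and its built-in `PoliticalDemocracy` data. The model syntax below is inferred from the parameters that appear in the output and in the parameter listings of section 1.5 (three measurement models, two regressions, six residual covariances); the object names `model` and `fit` are illustrative.

```r
library(lavaan)

model <- '
  # measurement model
    ind60 =~ x1 + x2 + x3
    dem60 =~ y1 + y2 + y3 + y4
    dem65 =~ y5 + y6 + y7 + y8
  # regressions among the latent variables
    dem60 ~ ind60
    dem65 ~ ind60 + dem60
  # residual covariances
    y1 ~~ y5
    y2 ~~ y4 + y6
    y3 ~~ y7
    y4 ~~ y8
    y6 ~~ y8
'
fit <- sem(model, data = PoliticalDemocracy)

# standardized = TRUE adds the Std.lv and Std.all columns
summary(fit, standardized = TRUE)
```

Column labels and layout may differ slightly between lavaan versions.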

1.5 Model parameters and model matrices

model parameters

> coef(fit)
   ind60=~x2    ind60=~x3    dem60=~y2    dem60=~y3    dem60=~y4    dem65=~y6
       2.180        1.819        1.257        1.058        1.265        1.186
   dem65=~y7    dem65=~y8  dem60~ind60  dem65~ind60  dem65~dem60       y1~~y5
       1.280        1.266        1.483        0.572        0.837        0.624
      y2~~y4       y2~~y6       y3~~y7       y4~~y8       y6~~y8       x1~~x1
       1.313        2.153        0.795        0.348        1.356        0.082
      x2~~x2       x3~~x3       y1~~y1       y2~~y2       y3~~y3       y4~~y4
       0.120        0.467        1.891        7.373        5.067        3.148
      y5~~y5       y6~~y6       y7~~y7       y8~~y8 ind60~~ind60 dem60~~dem60
       2.351        4.954        3.431        3.254        0.448        3.956
dem65~~dem65
       0.172

model matrices: free parameters

> inspect(fit)
$lambda
   ind60 dem60 dem65
x1     0     0     0
x2     1     0     0
x3     2     0     0
y1     0     0     0
y2     0     3     0
y3     0     4     0
y4     0     5     0
y5     0     0     0
y6     0     0     6
y7     0     0     7
y8     0     0     8

$theta
   x1 x2 x3 y1 y2 y3 y4 y5 y6 y7 y8
x1 18
x2  0 19
x3  0  0 20
y1  0  0  0 21
y2  0  0  0  0 22
y3  0  0  0  0  0 23
y4  0  0  0  0 13  0 24
y5  0  0  0 12  0  0  0 25
y6  0  0  0  0 14  0  0  0 26
y7  0  0  0  0  0 15  0  0  0 27
y8  0  0  0  0  0  0 16  0 17  0 28

$psi
      ind60 dem60 dem65
ind60    29
dem60     0    30
dem65     0     0    31

$beta
      ind60 dem60 dem65
ind60     0     0     0
dem60     9     0     0
dem65    10    11     0

model matrices: estimated values

> inspect(fit, "est")
$lambda
   ind60 dem60 dem65
x1 1.000 0.000 0.000
x2 2.180 0.000 0.000
x3 1.819 0.000 0.000
y1 0.000 1.000 0.000
y2 0.000 1.257 0.000
y3 0.000 1.058 0.000
y4 0.000 1.265 0.000
y5 0.000 0.000 1.000
y6 0.000 0.000 1.186
y7 0.000 0.000 1.280
y8 0.000 0.000 1.266

$theta
      x1    x2    x3    y1    y2    y3    y4    y5    y6    y7    y8
x1 0.082
x2 0.000 0.120
x3 0.000 0.000 0.467
y1 0.000 0.000 0.000 1.891
y2 0.000 0.000 0.000 0.000 7.373
y3 0.000 0.000 0.000 0.000 0.000 5.067
y4 0.000 0.000 0.000 0.000 1.313 0.000 3.148
y5 0.000 0.000 0.000 0.624 0.000 0.000 0.000 2.351
y6 0.000 0.000 0.000 0.000 2.153 0.000 0.000 0.000 4.954
y7 0.000 0.000 0.000 0.000 0.000 0.795 0.000 0.000 0.000 3.431
y8 0.000 0.000 0.000 0.000 0.000 0.000 0.348 0.000 1.356 0.000 3.254

$psi
      ind60 dem60 dem65
ind60 0.448
dem60 0.000 3.956
dem65 0.000 0.000 0.172

$beta
      ind60 dem60 dem65
ind60 0.000 0.000     0
dem60 1.483 0.000     0
dem65 0.572 0.837     0

computing the model-implied covariance matrix (optional)

# easy way is:
fitted(fit)
# manual
attach(inspect(fit, "est"))
IB
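One possible way to carry out the manual computation, sketched here from the formula in section 1.4 and not taken from the slides. It assumes the lavaan package and its built-in `PoliticalDemocracy` data, refits the model so that the block runs on its own, and uses `lavInspect()` in place of `attach()`; the names `IB_inv` and `Sigma_hat` are illustrative.

```r
library(lavaan)

model <- '
  ind60 =~ x1 + x2 + x3
  dem60 =~ y1 + y2 + y3 + y4
  dem65 =~ y5 + y6 + y7 + y8
  dem60 ~ ind60
  dem65 ~ ind60 + dem60
  y1 ~~ y5
  y2 ~~ y4 + y6
  y3 ~~ y7
  y4 ~~ y8
  y6 ~~ y8
'
fit <- sem(model, data = PoliticalDemocracy)

# Estimated model matrices as plain matrices
est    <- lavInspect(fit, "est")
Lambda <- unclass(est$lambda)
Theta  <- unclass(est$theta)
Psi    <- unclass(est$psi)
B      <- unclass(est$beta)

# Sigma = Lambda (I - B)^{-1} Psi (I - B)^{-1}' Lambda' + Theta
IB_inv    <- solve(diag(nrow(B)) - B)
Sigma_hat <- Lambda %*% IB_inv %*% Psi %*% t(IB_inv) %*% t(Lambda) + Theta

# Largest absolute difference from the 'easy way'
max(abs(Sigma_hat - unclass(fitted(fit)$cov)))
```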
