This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this site. Copyright 2007, The Johns Hopkins University and Qian-Li Xue. All rights reserved. Use of these materials permitted only in accordance with license rights granted. Materials provided AS IS; no representations or warranties provided. User assumes all responsibility for use, and all liability related thereto, and must independently review all materials for accuracy and efficacy. May contain materials owned by others. User is responsible for obtaining permissions for use from third parties as needed.
Introduction to Structural Equations with Latent Variables
Statistics for Psychosocial Research II: Structural Models
Qian-Li Xue
Choosing an Analytic Approach (flowchart): Test of causal hypotheses? No → ordinary regression. Yes → SEM (origin: path models). Continuous endogenous variables and continuous LV? Yes → Classic SEM. Categorical indicators and categorical LV? Yes → Latent Class Reg.; categorical indicators with a continuous LV → Latent Trait; continuous indicators with a categorical LV → Latent Profile. Within Classic SEM: Longitudinal data? Yes → Adv. SEM I (latent growth curves). Multilevel data? Yes → Adv. SEM II (multilevel models). Otherwise → Classic SEM.
Adding Latent Variables to the Model. So far we've only included observed variables in our path models. Extension to latent variables: we need to add a measurement piece, i.e., how do we define the latent variable (think factor analysis)? The models are more complicated to look at, but the same general principles apply.
Path Diagram: [figure] ξ → η₁ (coefficient γ₁); ξ → η₂ (γ₂); η₁ → η₂ (β₂₁); disturbances ζ₁ on η₁ and ζ₂ on η₂.
Notation for Latent Variable Model. η = latent endogenous variable (eta); ξ = latent exogenous variable (ksi, pronounced "kah-see"); ζ = latent error (zeta); β = coefficient on a latent endogenous variable (beta); γ = coefficient on a latent exogenous variable (gamma). The model: η₁ = γ₁ξ + ζ₁ and η₂ = β₂₁η₁ + γ₂ξ + ζ₂. E.g., ξ is childhood home environment, η₁ is social support, and η₂ is cancer coping ability. In matrix form: η = Βη + Γξ + ζ, with Cov(ξ) = Φ and Cov(ζ) = Ψ.
Path Diagram: [figure] The same structural model with a measurement piece added: x₁, x₂, x₃ are indicators of ξ (loadings λ₁, λ₂, λ₃; errors δ₁, δ₂, δ₃); y₁, y₂, y₃ are indicators of η₁ (loadings λ₄, λ₅, λ₆; errors ε₁, ε₂, ε₃); y₄, y₅, y₆ are indicators of η₂ (loadings λ₇, λ₈, λ₉; errors ε₄, ε₅, ε₆); structural paths γ₁, γ₂, β₂₁ and disturbances ζ₁, ζ₂ as before.
Notation for Measurement Model. y = observed indicator of η; x = observed indicator of ξ; ε = measurement error on y; δ = measurement error on x; λx = coefficient relating x to ξ; λy = coefficient relating y to η. Measurement equations: x₁ = λ₁ξ + δ₁; x₂ = λ₂ξ + δ₂; x₃ = λ₃ξ + δ₃; y₁ = λ₄η₁ + ε₁; y₂ = λ₅η₁ + ε₂; y₃ = λ₆η₁ + ε₃; y₄ = λ₇η₂ + ε₄; y₅ = λ₈η₂ + ε₅; y₆ = λ₉η₂ + ε₆. In matrix form: y = Λyη + ε, Cov(ε) = Θε; x = Λxξ + δ, Cov(δ) = Θδ.
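To make the notation concrete, here is a minimal simulation sketch (not from the lecture; all parameter values and error scales are illustrative assumptions) that generates data from this structural-plus-measurement model in numpy:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1000

# Structural part: eta1 = gamma1*xi + zeta1; eta2 = beta21*eta1 + gamma2*xi + zeta2
gamma1, gamma2, beta21 = 0.8, 0.5, 0.6          # assumed values for illustration
xi = rng.normal(0, 1, n)                        # latent exogenous variable
zeta1 = rng.normal(0, 0.5, n)
zeta2 = rng.normal(0, 0.5, n)
eta1 = gamma1 * xi + zeta1
eta2 = beta21 * eta1 + gamma2 * xi + zeta2

# Measurement part: x = Lambda_x * xi + delta; y = Lambda_y * eta + epsilon
Lambda_x = np.array([1.0, 0.9, 0.8])            # loadings lambda1..lambda3
x = np.outer(xi, Lambda_x) + rng.normal(0, 0.4, (n, 3))

Lambda_y = np.array([[1.0, 0.0], [0.9, 0.0], [0.8, 0.0],   # y1..y3 load on eta1
                     [0.0, 1.0], [0.0, 0.9], [0.0, 0.8]])  # y4..y6 load on eta2
eta = np.column_stack([eta1, eta2])
y = eta @ Lambda_y.T + rng.normal(0, 0.4, (n, 6))
```

The sample covariance matrix of the simulated (x, y) variables is what the fitting functions discussed below try to reproduce.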
Example: Model Specification. [diagram] ξ is industrialization in 1960; η₁ is democracy in 1960; η₂ is democracy in 1965; structural paths γ₁, γ₂, β₂₁; disturbances ζ₁, ζ₂. In matrix form: [η₁; η₂] = [0 0; β₂₁ 0][η₁; η₂] + [γ₁; γ₂]ξ + [ζ₁; ζ₂], i.e., η = Βη + Γξ + ζ, Cov(ξ) = Φ, Cov(ζ) = Ψ.
Example: Model Specification (measurement part). [figure] y₁, y₅: freedom of the press; y₂, y₆: freedom of group opposition; y₃, y₇: fairness of elections; y₄, y₈: effectiveness of the legislative body (y₁–y₄ measure democracy in 1960, η₁, with errors ε₁–ε₄; y₅–y₈ measure democracy in 1965, η₂, with errors ε₅–ε₈). x₁: GNP per capita; x₂: energy consumption per capita; x₃: % labor force (indicators of industrialization ξ, with errors δ₁–δ₃). Bollen, pp. 322–323.
Model Estimation. In multiple regression, estimation is based on individual cases via least squares, i.e., minimization of Σᵢ₌₁ⁿ (Yᵢ − Ŷᵢ)². In SEM, estimation is based on covariances. If the model is correct and θ is known, then Σ = Σ(θ), where Σ is the population covariance matrix of the observed variables and Σ(θ) is the covariance matrix written as a function of the model parameters θ.
Model Estimation. Regression analysis and confirmatory factor analysis are special cases. E.g., for y = γx + ζ with ζ ⊥ x and E(ζ) = 0:

[Var(y) Cov(x, y); Cov(x, y) Var(x)] = [γ²Var(x) + Var(ζ) γVar(x); γVar(x) Var(x)]

so γ = Cov(x, y)/Var(x). Does γ look familiar? Remember that in SLR, β = (x′x)⁻¹x′y. (See the numerical check below.)
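A quick numerical check of this equivalence (a sketch; the data and the true γ = 1.5 are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0, 2, 500)
y = 1.5 * x + rng.normal(0, 1, 500)     # true gamma = 1.5

# Covariance-based estimate: gamma = Cov(x, y) / Var(x)
gamma_cov = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# OLS slope on mean-centered data: beta = (x'x)^{-1} x'y
xc, yc = x - x.mean(), y - y.mean()
gamma_ols = (xc @ yc) / (xc @ xc)

print(gamma_cov, gamma_ols)             # identical to floating-point precision
```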
Model Estimation. In reality, neither the population covariance matrix Σ nor the parameters θ are known. What we have is a sample estimate of Σ: S. Goal: estimate θ based on S by choosing the estimates of θ such that Σ̂ = Σ(θ̂) is as close to S as possible. But how close is close? Define and minimize objective functions, also called fitting functions, F(S, Σ̂), e.g., F(S, Σ̂) = ||S − Σ̂||.
Common Fitting Functions: Maximum Likelihood (ML). Minimize F_ML = (1/2)tr({[S − Σ(θ)]Σ⁻¹(θ)}²) or F_ML = log|Σ(θ)| + tr(SΣ⁻¹(θ)) − log|S| − (p+q). Explicit solutions for θ may not exist; an iterative numerical procedure is needed. Asymptotic properties of ML estimators: consistent, i.e., θ̂ → θ as n → ∞; efficient, i.e., smallest asymptotic variance; asymptotically normal.
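As a sketch of that iterative procedure (assuming a simple one-factor model with free loadings and error variances; the data, loadings, and starting values are illustrative, not the lecture's example), F_ML can be minimized numerically:

```python
import numpy as np
from scipy.optimize import minimize

def implied_cov(theta, p):
    lam, psi = theta[:p], theta[p:]            # loadings and error variances
    return np.outer(lam, lam) + np.diag(psi)   # Sigma(theta) = lam lam' + Psi

def f_ml(theta, S):
    # F_ML = log|Sigma(theta)| + tr(S Sigma^{-1}(theta)) - log|S| - (p+q)
    p = S.shape[0]
    Sigma = implied_cov(theta, p)
    sign, logdet = np.linalg.slogdet(Sigma)
    if sign <= 0:
        return np.inf                          # reject non-positive-definite Sigma
    return logdet + np.trace(S @ np.linalg.inv(Sigma)) - np.linalg.slogdet(S)[1] - p

# Sample covariance S from simulated one-factor data
rng = np.random.default_rng(1)
lam_true = np.array([0.9, 0.8, 0.7, 0.6])
scores = rng.normal(size=(500, 1))
X = scores * lam_true + rng.normal(0, 0.5, (500, 4))
S = np.cov(X, rowvar=False)

res = minimize(f_ml, x0=np.r_[np.ones(4), 0.5 * np.ones(4)], args=(S,),
               method="Nelder-Mead")
print(res.x[:4])                               # estimated loadings, near lam_true
```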
Common Fitting Functions: Maximum Likelihood (ML). Advantages: (1) Scale invariant: F(S, Σ(θ)) is scale invariant if F(S, Σ(θ)) = F(DSD, DΣ(θ)D), where D is a diagonal matrix with positive elements. E.g., if D consists of the inverses of the standard deviations of the observed variables, DSD becomes a correlation matrix. More generally, the value of F is the same for any change of scale (e.g., dollars to cents). (2) Scale free: knowing D, we can calculate θ̂* (based on transformed data) from θ̂ (based on non-transformed data) without actually rerunning the model. (3) Test of overall model fit for an overidentified model, based on the fact that (N−1)F_ML follows a χ² distribution with ½(p+q)(p+q+1) − t degrees of freedom. Disadvantage: assumption of multivariate normality.
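A numerical check of the scale-invariance property (a sketch, with arbitrary positive-definite matrices standing in for S and Σ(θ)):

```python
import numpy as np

def f_ml(S, Sigma):
    p = S.shape[0]
    return (np.linalg.slogdet(Sigma)[1] + np.trace(S @ np.linalg.inv(Sigma))
            - np.linalg.slogdet(S)[1] - p)

rng = np.random.default_rng(2)
A = rng.normal(size=(4, 4)); S = A @ A.T + 4 * np.eye(4)      # stand-in "sample" matrix
B = rng.normal(size=(4, 4)); Sigma = B @ B.T + 4 * np.eye(4)  # stand-in "implied" matrix

D = np.diag(1 / np.sqrt(np.diag(S)))    # inverse SDs: DSD is a correlation matrix
print(f_ml(S, Sigma))
print(f_ml(D @ S @ D, D @ Sigma @ D))   # same value: F_ML is scale invariant
```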
Common Fitting Functions: Unweighted Least Squares (ULS). Minimize F_ULS = (1/2)tr[(S − Σ(θ))²]. Analogous to OLS: it minimizes the sum of squares of each element in the residual matrix (S − Σ(θ)). It gives greater weight to the off-diagonal covariance terms than to the variance terms (each off-diagonal element appears twice in the symmetric residual matrix). Explicit solutions for θ may not exist; an iterative numerical procedure is needed. Advantages of ULS estimators: intuitive; consistent, i.e., θ̂ → θ as n → ∞; no distributional assumptions. Disadvantages: not most efficient; not scale invariant, not scale free.
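F_ULS is simple enough to state in a couple of lines (a sketch; S and Sigma are any same-sized symmetric matrices):

```python
import numpy as np

def f_uls(S, Sigma):
    R = S - Sigma                 # residual covariance matrix
    # tr(R @ R) sums R[i,j]**2 over all cells, so each off-diagonal
    # covariance residual is counted twice relative to the variances
    return 0.5 * np.trace(R @ R)
```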
Common Fitting Functions: Generalized Least Squares (GLS). Minimize F_GLS = (1/2)tr({[S − Σ(θ)]S⁻¹}²). This weights the elements of (S − Σ(θ)) according to the variances and covariances of the observed variables. F_ULS is a special case of F_GLS with S⁻¹ replaced by I. Advantages of GLS estimators: intuitive; consistent, i.e., θ̂ → θ as n → ∞; asymptotically normal (availability of significance tests); asymptotically efficient; scale invariant and scale free. Test of overall model fit for an overidentified model, based on the fact that (N−1)F_GLS follows a χ² distribution with ½(p+q)(p+q+1) − t degrees of freedom. Disadvantage: sensitive to fat or thin tails (excess kurtosis).
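And the corresponding sketch for F_GLS, which reduces to the F_ULS sketch above when the weight matrix S⁻¹ is replaced by the identity:

```python
import numpy as np

def f_gls(S, Sigma):
    W = np.linalg.inv(S)          # weight residuals by the inverse sample covariances
    M = (S - Sigma) @ W
    return 0.5 * np.trace(M @ M)
```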
Simple Case of SEM with Latent Variables: Confirmatory Factor Analysis
Recap of Basic Characteristics of Exploratory Factor Analysis (EFA). Most EFA approaches extract orthogonal factors, which is boring to SEM users. There is a distinction between common and unique variances. EFA is underidentified (i.e., it has no unique solution); remember rotation? Equally good fit with different rotations! All measures are related to each factor.
Confirmatory Factor Analysis (CFA). Takes factor analysis a step further: we can test or confirm a highly constrained a priori structure that meets the conditions of model identification. But be careful, a model can never be confirmed!! The CFA model is constructed in advance: the number of latent variables ("factors") is pre-set by the analyst (not usually part of the modeling); whether a latent variable influences an observed variable is specified; measurement errors may correlate. Difference between CFA and the usual SEM: SEM assumes causally interrelated latent variables; CFA assumes interrelated latent variables (i.e., they are exogenous).
Exploratory Factor Analysis. Two-factor model: x = Λξ + δ, i.e.,

[x₁; x₂; x₃; x₄; x₅; x₆] = [λ₁₁ λ₁₂; λ₂₁ λ₂₂; λ₃₁ λ₃₂; λ₄₁ λ₄₂; λ₅₁ λ₅₂; λ₆₁ λ₆₂] [ξ₁; ξ₂] + [δ₁; δ₂; δ₃; δ₄; δ₅; δ₆]

[figure: both ξ₁ and ξ₂ point to every indicator x₁–x₆, each with its error δ₁–δ₆]
CFA Notation. Two-factor model: x = Λξ + δ, with a constrained loading matrix:

[x₁; x₂; x₃; x₄; x₅; x₆] = [λ₁₁ 0; λ₂₁ 0; λ₃₁ 0; 0 λ₄₂; 0 λ₅₂; 0 λ₆₂] [ξ₁; ξ₂] + [δ₁; δ₂; δ₃; δ₄; δ₅; δ₆]

[figure: ξ₁ points only to x₁–x₃ and ξ₂ only to x₄–x₆; the two factors are correlated]
For the Matrix-Challenged. CFA: x₁ = λ₁ξ₁ + δ₁; x₂ = λ₂ξ₁ + δ₂; x₃ = λ₃ξ₁ + δ₃; x₄ = λ₄₂ξ₂ + δ₄; x₅ = λ₅₂ξ₂ + δ₅; x₆ = λ₆₂ξ₂ + δ₆; cov(ξ₁, ξ₂) = φ₂₁. EFA: x₁ = λ₁₁ξ₁ + λ₁₂ξ₂ + δ₁; x₂ = λ₂₁ξ₁ + λ₂₂ξ₂ + δ₂; x₃ = λ₃₁ξ₁ + λ₃₂ξ₂ + δ₃; x₄ = λ₄₁ξ₁ + λ₄₂ξ₂ + δ₄; x₅ = λ₅₁ξ₁ + λ₅₂ξ₂ + δ₅; x₆ = λ₆₁ξ₁ + λ₆₂ξ₂ + δ₆; cov(ξ₁, ξ₂) = 0.
More Important Notation. Φ (capital of φ): covariance matrix of the factors, Φ = Var(ξ) = [φ₁₁ φ₁₂; φ₂₁ φ₂₂], so var(ξ₁) = φ₁₁ and cov(ξ₁, ξ₂) = φ₂₁. Ψ (capital of ψ): covariance matrix of the errors, Ψ = Var(δ), a symmetric 6×6 matrix with lower triangle [ψ₁₁; ψ₂₁ ψ₂₂; ψ₃₁ ψ₃₂ ψ₃₃; ψ₄₁ ψ₄₂ ψ₄₃ ψ₄₄; ψ₅₁ ψ₅₂ ψ₅₃ ψ₅₄ ψ₅₅; ψ₆₁ ψ₆₂ ψ₆₃ ψ₆₄ ψ₆₅ ψ₆₆], so var(δ₁) = ψ₁₁ and cov(δ₁, δ₂) = ψ₂₁. NOTE: usually ψᵢⱼ = 0 if i ≠ j.
Model Estimation. In our previous CFA example, we have six equations and many more unknowns; in this form there is not enough information to uniquely solve for the loadings λ and the factor correlation. What if we multiply both sides of x = Λξ + δ by x′?

xx′ = (Λξ + δ)(Λξ + δ)′ = (Λξ)(Λξ)′ + (Λξ)δ′ + δ(Λξ)′ + δδ′

Because ξ and δ are independent, the cross terms drop out in expectation, leaving E(xx′) = ΛE(ξξ′)Λ′ + E(δδ′), i.e., Σ = ΛΦΛ′ + Ψ = Σ(θ), where Φ is the covariance matrix of the factors ξ and Ψ is the error covariance matrix.
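The implied covariance matrix Σ(θ) = ΛΦΛ′ + Ψ for the two-factor CFA is easy to build directly (a sketch with illustrative parameter values; in practice these are the quantities being estimated):

```python
import numpy as np

Lambda = np.array([[1.0, 0.0],     # x1-x3 load only on xi1
                   [0.9, 0.0],
                   [0.8, 0.0],
                   [0.0, 1.0],     # x4-x6 load only on xi2
                   [0.0, 0.9],
                   [0.0, 0.8]])
Phi = np.array([[1.0, 0.4],
                [0.4, 1.0]])       # factor covariance phi21 = 0.4
Psi = np.diag([0.3] * 6)           # uncorrelated errors: psi_ij = 0 for i != j

Sigma = Lambda @ Phi @ Lambda.T + Psi   # the Sigma(theta) the fitting functions target
print(Sigma)
```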
Model Constraints: the hallmark of CFA. Purposes for setting constraints: to test a priori theory; to ensure identifiability; to test the reliability of measures.
Model Constraints: Identifiability. Latent variables (LVs) need some constraints: because factors are unmeasured, their variances can take different values. Recall EFA, where we constrained the factors: F ~ N(0,1). Otherwise, the model is not identifiable. Here we have two options: fix the variance of each latent variable to 1 (or another constant), or fix one path between each LV and an indicator.
Necessary Constraints: [figure, two diagrams] Left (fix variances): ξ₁ and ξ₂ each have variance fixed to 1, with all loadings to x₁–x₆ (errors δ₁–δ₆) freely estimated. Right (fix path): one loading per factor (x₁ for ξ₁, x₄ for ξ₂) is fixed to 1, and the factor variances are freely estimated.
Model Parametrization. Fix variances: x₁ = λ₁ξ₁ + δ₁; x₂ = λ₂ξ₁ + δ₂; x₃ = λ₃ξ₁ + δ₃; x₄ = λ₄₂ξ₂ + δ₄; x₅ = λ₅₂ξ₂ + δ₅; x₆ = λ₆₂ξ₂ + δ₆; var(ξ₁) = var(ξ₂) = 1; cov(ξ₁, ξ₂) = φ₂₁. Fix path: x₁ = ξ₁ + δ₁; x₂ = λ₂ξ₁ + δ₂; x₃ = λ₃ξ₁ + δ₃; x₄ = ξ₂ + δ₄; x₅ = λ₅₂ξ₂ + δ₅; x₆ = λ₆₂ξ₂ + δ₆; var(ξ₁) = φ₁₁; var(ξ₂) = φ₂₂; cov(ξ₁, ξ₂) = φ₂₁. (A numerical check that the two parametrizations are equivalent follows.)
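A sketch (with made-up values) showing that the two parametrizations imply exactly the same covariance matrix, i.e., they are just two ways of fixing the scale of the factors:

```python
import numpy as np

# Option 1: factor variances fixed to 1, all loadings free
Lambda = np.array([[0.5, 0.0], [0.45, 0.0], [0.4, 0.0],
                   [0.0, 0.7], [0.0, 0.63], [0.0, 0.56]])
Phi = np.array([[1.0, 0.3], [0.3, 1.0]])
Psi = np.diag([0.2] * 6)

# Option 2: first loading of each factor fixed to 1, factor variances free
c = Lambda[[0, 3], [0, 1]]              # scales absorbed from the first loadings
Lambda2 = Lambda / c                    # leading loading of each factor becomes 1
Phi2 = np.diag(c) @ Phi @ np.diag(c)    # factor variances become phi11, phi22

S1 = Lambda @ Phi @ Lambda.T + Psi
S2 = Lambda2 @ Phi2 @ Lambda2.T + Psi
print(np.allclose(S1, S2))              # True: identical implied covariances
```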
Model Constraints: Reliability Assessment. Parallel measures: T₁ = T₂ [= E(x)] (true scores are equal); T affects x₁ and x₂ equally; Cov(δ₁, δ₂) = 0 (errors not correlated); Var(δ₁) = Var(δ₂) (equal error variances). [figure: T → x₁, x₂, x₃, each with loading 1, errors δ₁, δ₂, δ₃, and a common error variance e]
Model Constraints: Reliability Assessment. Tau-equivalent measures: T₁ = T₂; T affects x₁ and x₂ equally; but Var(δ₁) ≠ Var(δ₂) is allowed. Note: for standardized measures, it makes no sense to constrain the loadings without also constraining the residuals, since Var(x) = 1.0. [figure: T → x₁, x₂, x₃, each with loading 1, errors δ₁, δ₂, δ₃ with variances e₁, e₂, e₃] (A small sketch contrasting the two implied covariance structures follows.)
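A small sketch (illustrative values) contrasting the covariance structures the two sets of constraints imply for three measures of T:

```python
import numpy as np

var_T = 1.0
lam = np.ones(3)                          # T affects all three x's equally

theta_parallel = np.diag([0.4, 0.4, 0.4]) # parallel: equal error variances
theta_tau = np.diag([0.3, 0.4, 0.5])      # tau-equivalent: error variances may differ

Sigma_parallel = var_T * np.outer(lam, lam) + theta_parallel
Sigma_tau = var_T * np.outer(lam, lam) + theta_tau
print(Sigma_parallel)   # all covariances = Var(T); all variances equal
print(Sigma_tau)        # all covariances = Var(T); variances differ
```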