CFA Example Using Simulated Forgiveness of Situations (N = 1103)

The Forgiveness of Situations Subscale includes six items, three of which are reverse-coded, on a seven-point scale:

  1. When things go wrong for reasons that can’t be controlled, I get stuck in negative thoughts about it. (R)
  2. With time I can be understanding of bad circumstances in my life.
  3. If I am disappointed by uncontrollable circumstances in my life, I continue to think negatively about them. (R)
  4. I eventually make peace with bad situations in my life.
  5. It’s really hard for me to accept negative situations that aren’t anybody’s fault. (R)
  6. Eventually I let go of negative thoughts about bad circumstances that are beyond anyone’s control.

Response Anchors: the items are rated on a seven-point scale (the anchor labels are not reproduced here).

Note: the data for this example are simulated based on the results of the real data analysis. These results will differ from those reported in the other example file, but they are used here to show you how to execute the syntax for this analysis.

Package Installation and Loading

For this example, we will be using the ggplot2 (for general plotting), reshape2 (for data reshaping before plotting), lavaan (for CFA), and semPlot (for path diagrams) packages, along with the psych, knitr, and kableExtra packages.

#install each package if it is not available, then load it
if (require(ggplot2) == FALSE){
  install.packages("ggplot2")
}
library(ggplot2)
if (require(reshape2) == FALSE){
  install.packages("reshape2")
}
library(reshape2)
if (require(lavaan) == FALSE){
  install.packages("lavaan")
}
library(lavaan)
if (require(psych) == FALSE){
  install.packages("psych")
}
library(psych)
if (require(knitr) == FALSE){
  install.packages("knitr")
}
library(knitr)
if (require(kableExtra) == FALSE){
  install.packages("kableExtra")
}
library(kableExtra)
if (require(semPlot) == FALSE){
  install.packages("semPlot")
}
library(semPlot)

Data Import into R

The data are in a text file named Study2.dat originally used in Mplus (so no column names were included at the top of the file). The file contains more items than we will use, so we select only the “Forgiveness of Situations” items from the whole file.

#syntax from the previous (real data) example: commented out because simulated data are used instead
# #read in data file (Mplus file format so having to label columns)
# hfsData = read.table(file = "Study2.dat", header = FALSE, na.strings = "99999", col.names = c("PersonID", "Self1", "Self2r", "Self3", "Self4r", "Self5", "Self6r", "Other1r", "Other2", "Other3r", "Other4", "Other5r", "Other6", "Sit1r", "Sit2", "Sit3r", "Sit4", "Sit5r", "Sit6", "Selfsub", "Othsub", "Sitsub", "HFSsum"))
# 
# #select Situations items and PersonID variables
# hfsSituations = hfsData[c("PersonID", "Sit1r", "Sit2", "Sit3r", "Sit4", "Sit5r", "Sit6")]
#setting seed for output constancy across machines
set.seed(04092017)
# simulating data for analysis based on results of model04estimates
model04Simulate = "
# SitP loadings (all estimated)
SitP =~ 1.007*Sit2 + 1.064*Sit4 + 0.956*Sit6
# SitN loadings (all estimated)
SitN =~ 1.325*Sit1r + 1.349*Sit3r + 1.009*Sit5r
# Unique Variances:
Sit1r ~~ 1.294*Sit1r; Sit2 ~~ 0.888*Sit2; Sit3r ~~ 0.724*Sit3r; Sit4 ~~ 0.835*Sit4; Sit5r ~~ 1.926*Sit5r; Sit6 ~~ 1.428*Sit6; 
# Item Intercepts:
Sit2 ~ 5.289*1
Sit4 ~ 5.359*1
Sit6 ~ 5.321*1
Sit1r ~ 4.547*1
Sit3r ~ 4.896*1
Sit5r ~ 4.860*1
# Factor Covariances
SitP ~~ .564*SitN
"
#generate data
hfsSituations = simulateData(model = model04Simulate, sample.nobs = 1103L, model.type = "sem")
#add ID variable
hfsSituations = data.frame(PersonID = 1:1103, hfsSituations)
#reorder variables to match data file
hfsSituations = hfsSituations[c("PersonID", "Sit1r", "Sit2", "Sit3r", "Sit4", "Sit5r", "Sit6")]
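
Before computing any statistics, it is worth a quick look at the simulated data frame to confirm the simulation and reordering worked as intended (a simple check; output not shown):

#quick check of the simulated data
head(hfsSituations)
dim(hfsSituations)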

Observed Sample Statistics

Sample Correlation Matrix

The observed correlation matrix, rounded to three digits:

#here the c() function selects only the item variables, not the PersonID variable
round(cor(hfsSituations[c("Sit1r", "Sit2", "Sit3r", "Sit4", "Sit5r", "Sit6")]), digits = 3)
      Sit1r  Sit2 Sit3r  Sit4 Sit5r  Sit6
Sit1r 1.000 0.360 0.658 0.348 0.450 0.284
Sit2  0.360 1.000 0.412 0.588 0.257 0.464
Sit3r 0.658 0.412 1.000 0.400 0.497 0.315
Sit4  0.348 0.588 0.400 1.000 0.277 0.440
Sit5r 0.450 0.257 0.497 0.277 1.000 0.193
Sit6  0.284 0.464 0.315 0.440 0.193 1.000

Sample Means and Variances

The observed means, rounded to three digits:

apply(X = hfsSituations[c("Sit1r", "Sit2", "Sit3r", "Sit4", "Sit5r", "Sit6")], MARGIN = 2, FUN = function(x) round(mean(x), digits = 3))
Sit1r  Sit2 Sit3r  Sit4 Sit5r  Sit6 
4.443 5.287 4.778 5.314 4.761 5.263 

The observed variances (using \(N\) rather than \(N-1\) in the denominator, to match the maximum likelihood estimates and the Mplus example), rounded to three digits:

apply(X = hfsSituations[c("Sit1r", "Sit2", "Sit3r", "Sit4", "Sit5r", "Sit6")], MARGIN = 2, FUN = function(x) round(var(x, na.rm = TRUE)*1102/1103, digits = 3))
Sit1r  Sit2 Sit3r  Sit4 Sit5r  Sit6 
3.209 2.007 2.616 1.976 2.979 2.281 

Sample Covariances

To conduct a CFA, you really only need the item means, variances, and either the correlations or covariances among items. That said, modern estimation methods use the raw data (often called full information estimation) rather than summary statistics, because the raw data allow less restrictive missing data assumptions when using maximum likelihood and Bayesian estimation methods.

The sample covariance matrix can be found from the sample correlations and variances. Each covariance between a pair of variables \(y_1\) and \(y_2\) is denoted with a \(\sigma_{y_1, y_2}\) and each correlation is denoted with a \(\rho_{y_1, y_2}\). The variance of a variable is denoted by \(\sigma^2_{y_1}\) and the standard deviation of a variable is the square root of the variance \(\sqrt{\sigma^2_{y_1}}\). The covariance can be found by taking the correlation and multiplying it by the product of the standard deviations.

\[\sigma_{y_1, y_2} = \rho_{y_1, y_2}\sqrt{\sigma^2_{y_1}}\sqrt{\sigma^2_{y_2}}. \]

Inversely, the correlation can be found by taking the covariance and dividing it by the product of the standard deviations:

\[\rho_{y_1, y_2} = \frac{\sigma_{y_1, y_2}}{\sqrt{\sigma^2_{y_1}}\sqrt{\sigma^2_{y_2}}}. \] Again, we change the denominator from \(N-1\) to \(N\) to be consistent with the Mplus example, which calculates covariances using maximum likelihood.
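
As a numerical check of this formula, the covariance matrix can be rebuilt from the sample correlations and standard deviations (an illustrative sketch; the object names items, R, s, and S are arbitrary):

#illustrative check: covariance = correlation times the product of standard deviations
items = c("Sit1r", "Sit2", "Sit3r", "Sit4", "Sit5r", "Sit6")
R = cor(hfsSituations[items])
s = apply(X = hfsSituations[items], MARGIN = 2, FUN = sd)*sqrt(1102/1103) #SDs with N in the denominator
S = R*(s %o% s) #each entry is rho_12*sd_1*sd_2
round(S, digits = 3) #matches the covariance matrix computed directly below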

round(cov(hfsSituations[c("Sit1r", "Sit2", "Sit3r", "Sit4", "Sit5r", "Sit6")])*1102/1103, digits = 3)
      Sit1r  Sit2 Sit3r  Sit4 Sit5r  Sit6
Sit1r 3.209 0.914 1.905 0.876 1.391 0.768
Sit2  0.914 2.007 0.945 1.170 0.629 0.994
Sit3r 1.905 0.945 2.616 0.909 1.386 0.769
Sit4  0.876 1.170 0.909 1.976 0.673 0.934
Sit5r 1.391 0.629 1.386 0.673 2.979 0.502
Sit6  0.768 0.994 0.769 0.934 0.502 2.281

Sample Item Response Distributions

The assumptions of CFA (i.e., normally distributed factors, no item-level factor interactions, and conditionally normally distributed items) together imply that the item responses should be normally distributed. From the density plots below, do you think these are normally distributed?

#stack data
melted = melt(hfsSituations, id.vars = "PersonID")
#plot by variable
ggplot(melted, aes(value)) + geom_density() + facet_wrap(~ variable)
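
As a numeric complement to the plots, the psych package (loaded earlier) can report skewness and kurtosis for each item (an optional check):

#optional: item-level descriptive statistics, including skew and kurtosis
psych::describe(hfsSituations[c("Sit1r", "Sit2", "Sit3r", "Sit4", "Sit5r", "Sit6")])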

Lavaan Syntax

lavaan syntax is constructed as a long text string and saved in a character object. In the example below, model01SyntaxLong = " opens the text string, which continues for multiple lines until the final " terminates the string. The variable model01SyntaxLong then contains the text of the lavaan syntax. Within the syntax string, the R comment character # still functions to comment text around the syntax. Each part of the model syntax corresponds to a set of parameters in the CFA model. You can find more information about lavaan at http://lavaan.ugent.be.

model01SyntaxLong = "
# Model 1 --  Fully Z-Scored Factor Identification Approach
# Item factor loadings --> list the factor to the left of the =~ and the items to the right, separated by a plus
# Once the factor is defined it can be used in syntax as if it is an observed variable
# Parameters can be fixed to constant values by pre-multiplying with a number (e.g., 1*); starting values can be specified with start()*; labels can be attached by pre-multiplying with a name (e.g., lambda1*)
Sit =~ Sit1r + Sit2 + Sit3r + Sit4 + Sit5r + Sit6
# Item intercepts --> ~ 1 indicates means or intercepts
# You can put multiple lines of syntax on a single line using a ; 
Sit1r ~ 1; Sit2 ~ 1; Sit3r  ~ 1; Sit4 ~ 1; Sit5r ~ 1; Sit6 ~ 1;
# Item error (unique) variances and covariances --> use the ~~ command
Sit1r ~~ Sit1r; Sit2 ~~ Sit2; Sit3r ~~ Sit3r; Sit4 ~~ Sit4; Sit5r ~~ Sit5r; Sit6 ~~ Sit6; 
# Factor variance (fixed to one under the Z-score identification approach)
Sit ~~ 1*Sit
# Factor mean (intercept)
Sit ~ 0
"

To run lavaan, the syntax string variable is passed to the lavaan(model = model01SyntaxLong, ...) function. In the function call, we also supply:

  • data = hfsSituations: The data frame which contains the variables in the syntax.
  • estimator = "MLR": Enables the use of robust maximum likelihood estimation, which helps for data that are not entirely normal.
  • mimic = "mplus": Ensures compatibility with Mplus output, which is often needed in practice (and is certainly needed for our homework system).
  • std.lv = TRUE: Uses the Z-score method of identification, setting all latent variable means to zero, all latent variable variances to one, and estimating all factor loadings.
model01Estimates = lavaan(model = model01SyntaxLong, data = hfsSituations, estimator = "MLR", mimic = "mplus", std.lv = TRUE)

If model estimation is successful, no errors are displayed (and no other output is shown). All model information and results are contained within the model01Estimates object. To access a summary of the results, use the summary() function with the following arguments:

  • model01Estimates: The analysis to be summarized.
  • fit.measures=TRUE: Provides model fit indices in summary.
  • rsquare=TRUE: Provides the proportion of variance “explained” for each variable (\(R^2\)).
  • standardized = TRUE: Provides standardized estimates in the summary.
summary(model01Estimates, fit.measures = TRUE, rsquare = TRUE, standardized = TRUE)
lavaan (0.5-23.1097) converged normally after  39 iterations

  Number of observations                          1103

  Number of missing patterns                         1

  Estimator                                         ML      Robust
  Minimum Function Test Statistic              387.333     467.448
  Degrees of freedom                                 9           9
  P-value (Chi-square)                           0.000       0.000
  Scaling correction factor                                  0.829
    for the Yuan-Bentler correction (Mplus variant)

Model test baseline model:

  Minimum Function Test Statistic             2076.990    2090.854
  Degrees of freedom                                15          15
  P-value                                        0.000       0.000

User model versus baseline model:

  Comparative Fit Index (CFI)                    0.817       0.779
  Tucker-Lewis Index (TLI)                       0.694       0.632

  Robust Comparative Fit Index (CFI)                         0.816
  Robust Tucker-Lewis Index (TLI)                            0.693

Loglikelihood and Information Criteria:

  Loglikelihood user model (H0)             -11535.690  -11535.690
  Scaling correction factor                                  1.076
    for the MLR correction
  Loglikelihood unrestricted model (H1)     -11342.023  -11342.023
  Scaling correction factor                                  0.994
    for the MLR correction

  Number of free parameters                         18          18
  Akaike (AIC)                               23107.380   23107.380
  Bayesian (BIC)                             23197.484   23197.484
  Sample-size adjusted Bayesian (BIC)        23140.311   23140.311

Root Mean Square Error of Approximation:

  RMSEA                                          0.195       0.215
  90 Percent Confidence Interval          0.179  0.212       0.197  0.233
  P-value RMSEA <= 0.05                          0.000       0.000

  Robust RMSEA                                               0.196
  90 Percent Confidence Interval                             0.181  0.211

Standardized Root Mean Square Residual:

  SRMR                                           0.079       0.079

Parameter Estimates:

  Information                                 Observed
  Standard Errors                   Robust.huber.white

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  Sit =~                                                                
    Sit1r             1.317    0.060   21.977    0.000    1.317    0.735
    Sit2              0.844    0.057   14.718    0.000    0.844    0.596
    Sit3r             1.287    0.055   23.255    0.000    1.287    0.796
    Sit4              0.824    0.057   14.340    0.000    0.824    0.586
    Sit5r             0.970    0.055   17.607    0.000    0.970    0.562
    Sit6              0.720    0.059   12.137    0.000    0.720    0.477

Intercepts:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .Sit1r             4.443    0.054   82.373    0.000    4.443    2.480
   .Sit2              5.287    0.043  123.919    0.000    5.287    3.731
   .Sit3r             4.778    0.049   98.116    0.000    4.778    2.954
   .Sit4              5.314    0.042  125.550    0.000    5.314    3.780
   .Sit5r             4.761    0.052   91.624    0.000    4.761    2.759
   .Sit6              5.263    0.045  115.739    0.000    5.263    3.485
    Sit               0.000                               0.000    0.000

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .Sit1r             1.475    0.108   13.649    0.000    1.475    0.460
   .Sit2              1.294    0.082   15.747    0.000    1.294    0.645
   .Sit3r             0.959    0.093   10.279    0.000    0.959    0.367
   .Sit4              1.298    0.081   16.072    0.000    1.298    0.657
   .Sit5r             2.037    0.108   18.787    0.000    2.037    0.684
   .Sit6              1.762    0.088   20.034    0.000    1.762    0.772
    Sit               1.000                               1.000    1.000

R-Square:
                   Estimate
    Sit1r             0.540
    Sit2              0.355
    Sit3r             0.633
    Sit4              0.343
    Sit5r             0.316
    Sit6              0.228

Each section contains different portions of model information.
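
If you prefer to work with the estimates programmatically rather than reading the printed summary, lavaan's parameterEstimates() function returns them as a data frame (an optional extra; not required for the sections that follow):

#optional: all parameter estimates as a data frame, one row per parameter
parameterEstimates(model01Estimates, standardized = TRUE)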

Unstandardized Model Parameter Estimates

FACTOR LOADINGS (regression slopes of item response on factor)
Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  Sit =~                                                                
    Sit1r             1.317    0.060   21.977    0.000    1.317    0.735
    Sit2              0.844    0.057   14.718    0.000    0.844    0.596
    Sit3r             1.287    0.055   23.255    0.000    1.287    0.796
    Sit4              0.824    0.057   14.340    0.000    0.824    0.586
    Sit5r             0.970    0.055   17.607    0.000    0.970    0.562
    Sit6              0.720    0.059   12.137    0.000    0.720    0.477
Intercepts (of Items) – here, these are the actual item means because the factor mean is zero

Note: the last term in the list, Sit, is the “intercept” (mean) of the factor. As it does not have a standard error, this indicates it was fixed to zero and not estimated.

Intercepts:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .Sit1r             4.443    0.054   82.373    0.000    4.443    2.480
   .Sit2              5.287    0.043  123.919    0.000    5.287    3.731
   .Sit3r             4.778    0.049   98.116    0.000    4.778    2.954
   .Sit4              5.314    0.042  125.550    0.000    5.314    3.780
   .Sit5r             4.761    0.052   91.624    0.000    4.761    2.759
   .Sit6              5.263    0.045  115.739    0.000    5.263    3.485
    Sit               0.000                               0.000    0.000
Residual (Unique) Variances (variance of error terms)

Note: the last term in the list, Sit, is the variance of the factor. As it does not have a standard error, this indicates it was fixed to one (from the std.lv = TRUE option we used in the lavaan() function call) and not estimated.

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .Sit1r             1.475    0.108   13.649    0.000    1.475    0.460
   .Sit2              1.294    0.082   15.747    0.000    1.294    0.645
   .Sit3r             0.959    0.093   10.279    0.000    0.959    0.367
   .Sit4              1.298    0.081   16.072    0.000    1.298    0.657
   .Sit5r             2.037    0.108   18.787    0.000    2.037    0.684
   .Sit6              1.762    0.088   20.034    0.000    1.762    0.772
    Sit               1.000                               1.000    1.000

Making use of the unstandardized model estimates:

Writing out the model—individual predicted values:

  • \(Y_1 = \mu_1 + \lambda_1F + e_1 = 4.443 + 1.317F + e_1\)

Writing out the model—predicted item variances and covariances:

  • \(Var\left( Y_1 \right) = \left( \lambda^2_1 \right) Var\left(F\right) + Var\left(e_1\right) = (1.317^2)*(1) + 1.475 = 3.209\) (= original item variance)

  • \(Cov\left( Y_1, Y_2 \right) = \lambda_1 Var\left(F\right) \lambda_2 = (1.317)*(1)*(.844) = 1.112\) (actual covariance = .914, so the model over-predicted how related items 1 and 2 should be)
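
Rather than computing these predicted moments by hand, the full model-implied mean vector and covariance matrix can be requested from the fitted model and compared with the observed statistics reported earlier (a quick check using lavaan's fitted() method):

#model-implied (predicted) covariance matrix and mean vector
implied = fitted(model01Estimates)
round(implied$cov, digits = 3)  #compare to the observed covariance matrix above
round(implied$mean, digits = 3) #compare to the observed item means above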

Plotting Path Diagrams

The R package semPlot helps provide a path diagram of a model. Here is an example:

semPaths(object = model01Estimates, what = "est")
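
If standardized values are preferred in the diagram, semPlot can display those instead (an optional variation):

#path diagram showing standardized estimates instead of unstandardized estimates
semPaths(object = model01Estimates, what = "std")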

Standardized Model Parameter Estimates

The last two columns of the summary output, Std.lv and Std.all, contain the standardized model parameter estimates. To understand the difference between standardized and unstandardized parameter estimates, let’s start with the unstandardized estimates. Unstandardized regression coefficients (and factor loadings) have a scale of “units of Y” per “unit of X”. That is, the slope/loading represents the increase in the dependent variable \(Y\) per one-unit increase in the independent variable \(X\) (in our case, the factor). The units of \(Y\) are given by the standard deviation of \(Y\), or \(SD(Y)\); similarly, the units of \(X\) are given by the standard deviation of \(X\), or \(SD(X)\). You can think of the units as the ratio \(\frac{SD(Y)}{SD(X)}\). So, the first factor loading (a value of 1.317 for the item Sit1r) indicates that the numeric response to the item goes up by 1.317 for every one-unit increase in the factor Sit.

The process of standardization removes the units attached to the parameter. If the unstandardized factor loadings are expressed in units of \(\frac{SD(Y)}{SD(X)}\), standardized estimates are obtained by multiplying or dividing by the appropriate standard deviation so that the numerator or denominator of \(\frac{SD(Y)}{SD(X)}\) equals one. For the standardized estimates under std.lv, the units of the factor \((SD(X))\) are removed (yielding \(\frac{SD(Y)}{1}\)) by multiplying the estimate by the factor standard deviation. Because the factor standard deviation is set to one in this model, all estimates in this column are the same as the unstandardized estimates. These std.lv estimates can be used to see how the parameters would look under the Z-score identification method if another identification method had been used.

The standardized estimates listed under std.all are formed by multiplying the estimate by the standard deviation of the factor and dividing by the unconditional (raw) standard deviation of the item. For instance, the “fully” standardized factor loading of the first item is found by multiplying the unstandardized coefficient (1.317) by one (the factor standard deviation) and dividing by the item’s standard deviation (\(\sqrt{3.209}\) – the square root of the item variance shown at the beginning of this example). The resulting value, \(1.317\times\frac{1}{\sqrt{3.209}} = 0.735\), represents the factor loading that would have been obtained had the analysis been run (a) with a Z-score factor identification and (b) on an item that was itself a Z-score.

The process of standardization is the same for all parameters of the model: intercepts, loadings, and residual (unique) variances. The interpretation of standardized item intercepts is difficult and often these are not reported. The standardized versions of factor loadings and unique variances are commonly reported.

Moreover, for items measuring a single factor, the standardized loadings lead directly to the \(R^2\) estimate – the amount of variance in the item responses explained by the factor. The item \(R^2\) is the square of the fully standardized factor loading. The R-Square information is found at the end of the model summary, so long as the option rsquare = TRUE is specified in the summary() function call.

R-Square:
                   Estimate
    Sit1r             0.540
    Sit2              0.355
    Sit3r             0.633
    Sit4              0.343
    Sit5r             0.316
    Sit6              0.228
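
The fully standardized (std.all) estimates can also be requested directly as a table with lavaan's standardizedSolution() function, which can be convenient for reporting (an optional check):

#optional: fully standardized solution as a data frame
standardizedSolution(model01Estimates)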

Shortening Lavaan Syntax: Lavaan Default Options

The syntax above is designed to show the mapping of all CFA parameters to all lavaan syntax elements. In reality, all you would need to write to define this model is:

model01SyntaxShort = "
Sit =~ Sit1r + Sit2 + Sit3r + Sit4 + Sit5r + Sit6
"

When this shortened syntax is passed to the sem() function, lavaan fills in several defaults:

  • All item intercepts are estimated (each item gets its own) and the factor mean is fixed at zero.
  • All item error (unique) variances are estimated (each item gets its own).
  • All factor variances and covariances are estimated. To identify the factor scale, sem() by default fixes the factor loading of the first item after =~ to 1 (the marker item method); specifying std.lv = TRUE instead fixes the factor variance to one and estimates all loadings, as in the call below.

To run this shortened syntax, the sem() function is called instead of lavaan(), and the summary() function is used to view the results, as demonstrated below. The sem() function simplifies the syntax needed to conduct a confirmatory factor analysis. Note that the output is identical to the previous run.

model01EstimatesShort = sem(model = model01SyntaxShort, data = hfsSituations, estimator = "MLR", mimic = "mplus", std.lv = TRUE)
summary(model01EstimatesShort, fit.measures = TRUE, rsquare = TRUE, standardized = TRUE)
lavaan (0.5-23.1097) converged normally after  20 iterations

  Number of observations                          1103

  Number of missing patterns                         1

  Estimator                                         ML      Robust
  Minimum Function Test Statistic              387.333     467.448
  Degrees of freedom                                 9           9
  P-value (Chi-square)                           0.000       0.000
  Scaling correction factor                                  0.829
    for the Yuan-Bentler correction (Mplus variant)

Model test baseline model:

  Minimum Function Test Statistic             2076.990    2090.854
  Degrees of freedom                                15          15
  P-value                                        0.000       0.000

User model versus baseline model:

  Comparative Fit Index (CFI)                    0.817       0.779
  Tucker-Lewis Index (TLI)                       0.694       0.632

  Robust Comparative Fit Index (CFI)                         0.816
  Robust Tucker-Lewis Index (TLI)                            0.693

Loglikelihood and Information Criteria:

  Loglikelihood user model (H0)             -11535.690  -11535.690
  Scaling correction factor                                  1.076
    for the MLR correction
  Loglikelihood unrestricted model (H1)     -11342.023  -11342.023
  Scaling correction factor                                  0.994
    for the MLR correction

  Number of free parameters                         18          18
  Akaike (AIC)                               23107.380   23107.380
  Bayesian (BIC)                             23197.484   23197.484
  Sample-size adjusted Bayesian (BIC)        23140.311   23140.311

Root Mean Square Error of Approximation:

  RMSEA                                          0.195       0.215
  90 Percent Confidence Interval          0.179  0.212       0.197  0.233
  P-value RMSEA <= 0.05                          0.000       0.000

  Robust RMSEA                                               0.196
  90 Percent Confidence Interval                             0.181  0.211

Standardized Root Mean Square Residual:

  SRMR                                           0.079       0.079

Parameter Estimates:

  Information                                 Observed
  Standard Errors                   Robust.huber.white

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  Sit =~                                                                
    Sit1r             1.317    0.060   21.977    0.000    1.317    0.735
    Sit2              0.844    0.057   14.718    0.000    0.844    0.596
    Sit3r             1.287    0.055   23.255    0.000    1.287    0.796
    Sit4              0.824    0.057   14.340    0.000    0.824    0.586
    Sit5r             0.970    0.055   17.607    0.000    0.970    0.562
    Sit6              0.720    0.059   12.137    0.000    0.720    0.477

Intercepts:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .Sit1r             4.443    0.054   82.373    0.000    4.443    2.480
   .Sit2              5.287    0.043  123.919    0.000    5.287    3.731
   .Sit3r             4.778    0.049   98.116    0.000    4.778    2.954
   .Sit4              5.314    0.042  125.550    0.000    5.314    3.780
   .Sit5r             4.761    0.052   91.624    0.000    4.761    2.759
   .Sit6              5.263    0.045  115.739    0.000    5.263    3.485
    Sit               0.000                               0.000    0.000

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .Sit1r             1.475    0.108   13.649    0.000    1.475    0.460
   .Sit2              1.294    0.082   15.747    0.000    1.294    0.645
   .Sit3r             0.959    0.093   10.279    0.000    0.959    0.367
   .Sit4              1.298    0.081   16.072    0.000    1.298    0.657
   .Sit5r             2.037    0.108   18.787    0.000    2.037    0.684
   .Sit6              1.762    0.088   20.034    0.000    1.762    0.772
    Sit               1.000                               1.000    1.000

R-Square:
                   Estimate
    Sit1r             0.540
    Sit2              0.355
    Sit3r             0.633
    Sit4              0.343
    Sit5r             0.316
    Sit6              0.228
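
To see exactly which parameters the sem() function filled in by default, the parameter table of the fitted object can be inspected (an optional check; the free column distinguishes estimated parameters from fixed ones):

#optional: full parameter table, including the parameters added by sem() defaults
parTable(model01EstimatesShort)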

Alternative Identification Methods

There are multiple equivalent ways of specifying the same CFA model that differ only in the scaling of the factor mean and variance (i.e., different methods of identification). Now let’s see the model parameters when the marker item method is used for identification instead (a brief syntax sketch follows this list). In the marker item identification method:

  • The first factor loading of a factor (the first variable after the =~ symbol in lavaan syntax) is set to one
  • The factor variance is estimated
  • The factor mean is set to zero
  • All item intercepts and unique variances are estimated (as done in the Z-score identification method)
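
A minimal sketch of this identification approach in lavaan syntax, using the same data and estimator settings as above (the object names model02SyntaxMarker and model02Marker are placeholders for illustration):

#marker item identification: first loading fixed to 1, factor variance estimated
model02SyntaxMarker = "
Sit =~ 1*Sit1r + Sit2 + Sit3r + Sit4 + Sit5r + Sit6
Sit1r ~ 1; Sit2 ~ 1; Sit3r ~ 1; Sit4 ~ 1; Sit5r ~ 1; Sit6 ~ 1;
Sit1r ~~ Sit1r; Sit2 ~~ Sit2; Sit3r ~~ Sit3r; Sit4 ~~ Sit4; Sit5r ~~ Sit5r; Sit6 ~~ Sit6;
# factor variance estimated; factor mean fixed to zero
Sit ~~ Sit
Sit ~ 0
"
model02Marker = lavaan(model = model02SyntaxMarker, data = hfsSituations, estimator = "MLR", mimic = "mplus", std.lv = FALSE)
summary(model02Marker, fit.measures = TRUE, rsquare = TRUE, standardized = TRUE)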