The Forgiveness of Situations Subscale includes six items, three of which are reverse-coded, on a seven-point scale:
Response Anchors:
For this example, we will be using the ggplot2
(for general plotting), melt2
(for data reshaping before plotting), and lavaan
(for CFA) packages.
if (require(ggplot2)==FALSE){
install.packages("ggplot2")
}
Loading required package: ggplot2
library(ggplot2)
if (require(reshape2) == FALSE){
install.packages("reshape2")
}
Loading required package: reshape2
if (require(lavaan) == FALSE){
install.packages("lavaan")
}
Loading required package: lavaan
This is lavaan 0.5-23.1097
lavaan is BETA software! Please report any bugs.
library(lavaan)
if (require(psych) == FALSE){
install.packages("psych")
}
Loading required package: psych
Attaching package: ‘psych’
The following object is masked from ‘package:lavaan’:
cor2cov
The following objects are masked from ‘package:ggplot2’:
%+%, alpha
if (require(knitr) == FALSE){
install.packages("knitr")
}
Loading required package: knitr
if (require(kableExtra) == FALSE){
install.packages("kableExtra")
}
Loading required package: kableExtra
The data are in a text file named Study2.dat
orignally used in Mplus (so no column names were included at the top of the file). The file contains more items than we will use, so we select only the “Forgiveness of Situations” items from the whole file.
#read in data file (Mplus file format so having to label columns)
hfsData = read.table(file = "Study2.dat", header = FALSE, na.strings = "99999", col.names = c("PersonID", "Self1", "Self2r", "Self3", "Self4r", "Self5", "Self6r", "Other1r", "Other2", "Other3r", "Other4", "Other5r", "Other6", "Sit1r", "Sit2", "Sit3r", "Sit4", "Sit5r", "Sit6", "Selfsub", "Othsub", "Sitsub", "HFSsum"))
#select Situations items and PersonID variables
hfsSituations = hfsData[c("PersonID", "Sit1r", "Sit2", "Sit3r", "Sit4", "Sit5r", "Sit6")]
The observed correlation matrix, rounded to three digits:
#here the c() function selects only the variables, not the PersionID variable
round(cor(hfsSituations[c("Sit1r", "Sit2", "Sit3r", "Sit4", "Sit5r", "Sit6")]), digits = 3)
Sit1r Sit2 Sit3r Sit4 Sit5r Sit6
Sit1r 1.000 0.240 0.647 0.300 0.453 0.297
Sit2 0.240 1.000 0.317 0.570 0.255 0.457
Sit3r 0.647 0.317 1.000 0.369 0.482 0.356
Sit4 0.300 0.570 0.369 1.000 0.289 0.448
Sit5r 0.453 0.255 0.482 0.289 1.000 0.304
Sit6 0.297 0.457 0.356 0.448 0.304 1.000
The observed means, rounded to three digits:
apply(X = hfsSituations[c("Sit1r", "Sit2", "Sit3r", "Sit4", "Sit5r", "Sit6")], MARGIN = 2, FUN = function(x) round(mean(x), digits = 3))
Sit1r Sit2 Sit3r Sit4 Sit5r Sit6
4.547 5.289 4.896 5.359 4.860 5.321
The observed variances (using \(N\) in the denominator to match ML estimated output and Mplus example), rounded to three digits:
apply(X = hfsSituations[c("Sit1r", "Sit2", "Sit3r", "Sit4", "Sit5r", "Sit6")], MARGIN = 2, FUN = function(x) round(var(x, na.rm = TRUE)*1102/1103, digits = 3))
Sit1r Sit2 Sit3r Sit4 Sit5r Sit6
3.049 1.903 2.543 1.967 2.945 2.341
To do a CFA analysis, you only really need means, variances, and either correlations or covariances among items. That said, modern methods of estimation use the raw data (often called full information) rather than the summary statistics as the raw data enable better missing data assumptions when using maximum likelihood and Bayesian estimation methods.
The sample covariance matrix can be found from the sample correlations and variances. Each covariance between a pair of variables \(y_1\) and \(y_2\) is denoted with a \(\sigma_{y_1, y_2}\) and each correlation is denoted with a \(\rho_{y_1, y_2}\). The variance of a variable is denoted by \(\sigma^2_{y_1}\) and the standard deviation of a variable is the square root of the variance \(\sqrt{\sigma^2_{y_1}}\). The covariance can be found by taking the correlation and multiplying it by the product of the standard deviations.
\[\sigma_{y_1, y_2} = \rho_{y_1, y_2}\sqrt{\sigma^2_{y_1}}\sqrt{\sigma^2_{y_2}}. \]
Inversely, the correlation can be found by taking the covariance and dividing it by the product of the standard deviations:
\[\rho_{y_1, y_2} = \frac{\sigma_{y_1, y_2}}{\sqrt{\sigma^2_{y_1}}\sqrt{\sigma^2_{y_2}}}. \] Again, we change the denominator from \(N-1\) to \(N\) to be consistent with the Mplus example, which calculates covariances using maximum likelihood.
round(cov(hfsSituations[c("Sit1r", "Sit2", "Sit3r", "Sit4", "Sit5r", "Sit6")])*1102/1103, digits = 3)
Sit1r Sit2 Sit3r Sit4 Sit5r Sit6
Sit1r 3.049 0.577 1.802 0.734 1.358 0.795
Sit2 0.577 1.903 0.697 1.103 0.604 0.965
Sit3r 1.802 0.697 2.543 0.824 1.319 0.868
Sit4 0.734 1.103 0.824 1.967 0.695 0.962
Sit5r 1.358 0.604 1.319 0.695 2.945 0.798
Sit6 0.795 0.965 0.868 0.962 0.798 2.341
The assumptions of CFA (i.e., normally distributed factors, no item-level factor interactions, and conditionally normal distributed items) lead to the overall assumption that our item responses must be normally distributed. From the histograms below, do you think these are normally distributed?
#stack data
melted = melt(hfsSituations, id.vars = "PersonID")
#plot by variable
ggplot(melted, aes(value)) + geom_bar() + facet_wrap(~ variable)
lavaan
syntax is constructed using a long text string and saved in a character object. In the example below model01SyntaxLong = "
opens the text string, which continues for multiple lines until the final "
terminates the string. The variable model01Syntax now contains the text of the lavaan
syntax. Within the syntax string, the R comment character #
still functions to comment text around the syntax. Each part of the model syntax corresponds to a set of parameters in the CFA model. You can find information about lavaan at http://lavaan.ugent.be.
model01SyntaxLong = "
# Model 1 -- Fully Z-Scored Factor Identification Approach
# Item factor loadings --> list the factor to the left of the =~ and the items to the right, separated by a plus
# Once the factor is defined it can be used in syntax as if it is an observed variable
# Parameters can be fixed to constant values by using the * command; starting values can be specified by using start()*; labels can be implemented with []
Sit =~ Sit1r + Sit2 + Sit3r + Sit4 + Sit5r + Sit6
# Item intercepts --> ~ 1 indicates means or intercepts
# You can put multiple lines of syntax on a single line using a ;
Sit1r ~ 1; Sit2 ~ 1; Sit3r ~ 1; Sit4 ~ 1; Sit5r ~ 1; Sit6 ~ 1;
# Item error (unique) variances and covariances --> use the ~~ command
Sit1r ~~ Sit1r; Sit2 ~~ Sit2; Sit3r ~~ Sit3r; Sit4 ~~ Sit4; Sit5r ~~ Sit5r; Sit6 ~~ Sit6;
# Factor variance
Sit ~~ Sit
# Factor mean (intercept)
Sit ~ 0
"
To run lavaan, the syntax string variable is passed to the lavaan(model = model01SyntaxLong, ...)
function. In the function call, we also supply:
data = hfsSituations
: The data frame which contains the variables in the syntax.estimator = "MLR"
: Enables the use of robust maximum likelihood estimation, which helps for data that are not entirely normal.mimic = "mplus"
: Ensures compatability with Mplus output, which is often needed in practice (and is certainly needed for our homework system).std.lv = TRUE
: Uses the Z-score method of identification, setting all latent variable means to zero, all latent variable variances to one, and estimating all factor loadings.model01Estimates = lavaan(model = model01SyntaxLong, data = hfsSituations, estimator = "MLR", mimic = "mplus", std.lv = TRUE)
If the model estimation was successful, no errors would be displayed (and no output would be shown). All model information and results are contained within the model01Estimates
object. To access a summary of the results use the summary()
function with the following arugments:
model01Estimates
: The analysis to be summarized.fit.measures=TRUE
: Provides model fit indices in summary.rsqaure=TRUE
: Provides the proportion of variance “explained” for each variable (\(R^2\)).standardized = TRUE
: Provides standardized estimates in the summary.summary(model01Estimates, fit.measures = TRUE, rsquare = TRUE, standardized = TRUE)
lavaan (0.5-23.1097) converged normally after 20 iterations
Number of observations 1103
Number of missing patterns 1
Estimator ML Robust
Minimum Function Test Statistic 427.937 307.804
Degrees of freedom 9 9
P-value (Chi-square) 0.000 0.000
Scaling correction factor 1.390
for the Yuan-Bentler correction (Mplus variant)
Model test baseline model:
Minimum Function Test Statistic 1981.034 1128.693
Degrees of freedom 15 15
P-value 0.000 0.000
User model versus baseline model:
Comparative Fit Index (CFI) 0.787 0.732
Tucker-Lewis Index (TLI) 0.645 0.553
Robust Comparative Fit Index (CFI) 0.787
Robust Tucker-Lewis Index (TLI) 0.646
Loglikelihood and Information Criteria:
Loglikelihood user model (H0) -11536.404 -11536.404
Scaling correction factor 1.416
for the MLR correction
Loglikelihood unrestricted model (H1) -11322.435 -11322.435
Scaling correction factor 1.407
for the MLR correction
Number of free parameters 18 18
Akaike (AIC) 23108.808 23108.808
Bayesian (BIC) 23198.912 23198.912
Sample-size adjusted Bayesian (BIC) 23141.739 23141.739
Root Mean Square Error of Approximation:
RMSEA 0.205 0.173
90 Percent Confidence Interval 0.189 0.222 0.160 0.188
P-value RMSEA <= 0.05 0.000 0.000
Robust RMSEA 0.205
90 Percent Confidence Interval 0.185 0.224
Standardized Root Mean Square Residual:
SRMR 0.086 0.086
Parameter Estimates:
Information Observed
Standard Errors Robust.huber.white
Latent Variables:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
Sit =~
Sit1r 1.234 0.069 17.906 0.000 1.234 0.707
Sit2 0.702 0.074 9.441 0.000 0.702 0.509
Sit3r 1.241 0.063 19.847 0.000 1.241 0.778
Sit4 0.784 0.069 11.333 0.000 0.784 0.559
Sit5r 1.023 0.053 19.179 0.000 1.023 0.596
Sit6 0.819 0.069 11.942 0.000 0.819 0.535
Intercepts:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.Sit1r 4.547 0.053 86.474 0.000 4.547 2.604
.Sit2 5.289 0.042 127.346 0.000 5.289 3.834
.Sit3r 4.896 0.048 101.959 0.000 4.896 3.070
.Sit4 5.359 0.042 126.896 0.000 5.359 3.821
.Sit5r 4.860 0.052 94.060 0.000 4.860 2.832
.Sit6 5.321 0.046 115.492 0.000 5.321 3.477
Sit 0.000 0.000 0.000
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.Sit1r 1.526 0.149 10.217 0.000 1.526 0.500
.Sit2 1.409 0.128 11.014 0.000 1.409 0.741
.Sit3r 1.004 0.135 7.456 0.000 1.004 0.395
.Sit4 1.352 0.127 10.672 0.000 1.352 0.687
.Sit5r 1.899 0.118 16.025 0.000 1.899 0.645
.Sit6 1.671 0.159 10.517 0.000 1.671 0.714
Sit 1.000 1.000 1.000
R-Square:
Estimate
Sit1r 0.500
Sit2 0.259
Sit3r 0.605
Sit4 0.313
Sit5r 0.355
Sit6 0.286
Each section contains different portions of model information.
Latent Variables:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
Sit =~
Sit1r 1.234 0.069 17.906 0.000 1.234 0.707
Sit2 0.702 0.074 9.441 0.000 0.702 0.509
Sit3r 1.241 0.063 19.847 0.000 1.241 0.778
Sit4 0.784 0.069 11.333 0.000 0.784 0.559
Sit5r 1.023 0.053 19.179 0.000 1.023 0.596
Sit6 0.819 0.069 11.942 0.000 0.819 0.535
Note: the last term in the list, Sit
is the “intercept” (mean) of the factor. As it does not have a standard error, this indicates it was fixed to zero and not estimated.
Intercepts:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.Sit1r 4.547 0.053 86.474 0.000 4.547 2.604
.Sit2 5.289 0.042 127.346 0.000 5.289 3.834
.Sit3r 4.896 0.048 101.959 0.000 4.896 3.070
.Sit4 5.359 0.042 126.896 0.000 5.359 3.821
.Sit5r 4.860 0.052 94.060 0.000 4.860 2.832
.Sit6 5.321 0.046 115.492 0.000 5.321 3.477
Sit 0.000 0.000 0.000
Note: the last term in the list, Sit
, is the variance of the factor. As it does not have a standard error, this indicates it was fixed to one (from the std.lv = TRUE
option we used in the lavaan()
function call) and not estimated.
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.Sit1r 1.526 0.149 10.217 0.000 1.526 0.500
.Sit2 1.409 0.128 11.014 0.000 1.409 0.741
.Sit3r 1.004 0.135 7.456 0.000 1.004 0.395
.Sit4 1.352 0.127 10.672 0.000 1.352 0.687
.Sit5r 1.899 0.118 16.025 0.000 1.899 0.645
.Sit6 1.671 0.159 10.517 0.000 1.671 0.714
Sit 1.000 1.000 1.000
Writing out the model—individual predicted values:
Writing out the model—predicted item variances and covariances:
\(Var\left( Y_1 \right) = \left( \lambda^2_1 \right) Var\left(F\right) + Var\left(e_1\right) = (1.234^2)*(1) + 1.526 = 3.049\) (= original item variance)
\(Cov\left( Y_1, Y_2 \right) = \lambda_1 Var\left(F\right) \lambda_2 = (1.234)*(1)*(.702) = .866\) (actual covariance = .577, so the model over-predicted how related items 1 and 2 should be)
The last two columns of the summary output, Std.lv
and Std.all
, contain the standardized model parameter estimates. To understand the difference between standardized and unstandardized parameter estimates, let’s start with the standardized estimates. Standardized regression coefficients (and factor loadings) have a scale that is “units of Y” per “unit of X”. That is, the slope/loading, represents the increase in the dependent variable \(X\) per unit of the independent variabel \(X\) (in our case, the factor). The units of \(Y\) are given by the standard deviation of \(Y\), or \(SD(Y)\). Similarly, the units of \(X\) are given by the standard deviation of \(X\), or \(SD(X)\). You can think of the units associated by the fraction \(\frac{SD(Y)}{SD(X)}\). So, the first factor loading (a value of 1.234 for the item Sit1r) indicates the numeric response to the item goes up by 1.234 for every one-unit increase in the factor Sit.
The process of standardization removes the units attached to the parameter. So, if the unstandardized factor loadings are expreesed in units of \(\frac{SD(Y)}{SD(X)}\), the standardized units are acheived by either dividing or multiplying by the appropriate standard deviation to make the numerator or denominator of \(\frac{SD(Y)}{SD(X)}\) equal to one. For the standardized estimates under std.lv
, the units of the factor \((SD(X))\) are removed (yielding \(\frac{SD(Y)}{1}\)) by multiplying the estimate by the factor standard deviation. Because the factor standard deviation is set to one, all estimates in this column are the same as the unstandardized estimate. These unstandardized estimates can be used to see how parameters would look under the Z-score identifiction method another identification method was used.
The standardized estimates listed under std.all
are formed by multiplying the estimate by the standard deviation of the factor and dividing that by the unconditional (raw) standard deviation of the item. For instance, the “fully” standardized factor loading of the first item is found by multiplying the unstandardized coefficient (1.234) by one (the factor standard deviation) and dividing by the item’s standard deviation (\(\sqrt{3.049}\) – the square root of the item variance shown at the beginning of this example). The resulting value, \(1.234\times\frac{1}{\sqrt{3.049}} = 0.707\), represents the factor loading would the analysis have been run (a) with a Z-score factor identification and (b) on an item that was a Z-score.
The process of standardization is the same for all parameters of the model: intercepts, loadings, and residual (unique) variances. The interpretation of standardized item intercepts is difficult and often these are not reported. The standardized versions of factor loadings and unique variances are commonly reported.
Moreover, in for items measuring one factor, the standardized loadings lead directly to the \(R^2\) estimate – the amount of variance in the item responses explained by the factor. The item \(R^2\) is found by the square of the unstandardized factor loading. The R-Square information is found at the end of the model summary, so long as the option rsquare = TRUE
is specified in the summary()
function call.
R-Square:
Estimate
Sit1r 0.500
Sit2 0.259
Sit3r 0.605
Sit4 0.313
Sit5r 0.355
Sit6 0.286
The syntax above is designed to show the mapping of all CFA parameters to all lavaan syntax elements. In reality, all you would need to write to define this model is:
model01SyntaxShort = "
Sit =~ Sit1r + Sit2 + Sit3r + Sit4 + Sit5r + Sit6
"
The syntax input into the sem()
function uses some defaults:
Similarly, to run this syntax the sem
fuction is now called. To see the results, the summary
function is used. The sem
function simplifies the syntax needed to conduct a confirmatory factor analysis. These are demonstrated below. Note the output is identical to what was run previously.
model01EstimatesShort = sem(model = model01SyntaxShort, data = hfsSituations, estimator = "MLR", mimic = "mplus", std.lv = TRUE)
summary(model01EstimatesShort, fit.measures = TRUE, rsquare = TRUE, standardized = TRUE)
lavaan (0.5-23.1097) converged normally after 20 iterations
Number of observations 1103
Number of missing patterns 1
Estimator ML Robust
Minimum Function Test Statistic 427.937 307.804
Degrees of freedom 9 9
P-value (Chi-square) 0.000 0.000
Scaling correction factor 1.390
for the Yuan-Bentler correction (Mplus variant)
Model test baseline model:
Minimum Function Test Statistic 1981.034 1128.693
Degrees of freedom 15 15
P-value 0.000 0.000
User model versus baseline model:
Comparative Fit Index (CFI) 0.787 0.732
Tucker-Lewis Index (TLI) 0.645 0.553
Robust Comparative Fit Index (CFI) 0.787
Robust Tucker-Lewis Index (TLI) 0.646
Loglikelihood and Information Criteria:
Loglikelihood user model (H0) -11536.404 -11536.404
Scaling correction factor 1.416
for the MLR correction
Loglikelihood unrestricted model (H1) -11322.435 -11322.435
Scaling correction factor 1.407
for the MLR correction
Number of free parameters 18 18
Akaike (AIC) 23108.808 23108.808
Bayesian (BIC) 23198.912 23198.912
Sample-size adjusted Bayesian (BIC) 23141.739 23141.739
Root Mean Square Error of Approximation:
RMSEA 0.205 0.173
90 Percent Confidence Interval 0.189 0.222 0.160 0.188
P-value RMSEA <= 0.05 0.000 0.000
Robust RMSEA 0.205
90 Percent Confidence Interval 0.185 0.224
Standardized Root Mean Square Residual:
SRMR 0.086 0.086
Parameter Estimates:
Information Observed
Standard Errors Robust.huber.white
Latent Variables:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
Sit =~
Sit1r 1.234 0.069 17.906 0.000 1.234 0.707
Sit2 0.702 0.074 9.441 0.000 0.702 0.509
Sit3r 1.241 0.063 19.847 0.000 1.241 0.778
Sit4 0.784 0.069 11.333 0.000 0.784 0.559
Sit5r 1.023 0.053 19.179 0.000 1.023 0.596
Sit6 0.819 0.069 11.942 0.000 0.819 0.535
Intercepts:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.Sit1r 4.547 0.053 86.474 0.000 4.547 2.604
.Sit2 5.289 0.042 127.346 0.000 5.289 3.834
.Sit3r 4.896 0.048 101.959 0.000 4.896 3.070
.Sit4 5.359 0.042 126.896 0.000 5.359 3.821
.Sit5r 4.860 0.052 94.060 0.000 4.860 2.832
.Sit6 5.321 0.046 115.492 0.000 5.321 3.477
Sit 0.000 0.000 0.000
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.Sit1r 1.526 0.149 10.217 0.000 1.526 0.500
.Sit2 1.409 0.128 11.014 0.000 1.409 0.741
.Sit3r 1.004 0.135 7.456 0.000 1.004 0.395
.Sit4 1.352 0.127 10.672 0.000 1.352 0.687
.Sit5r 1.899 0.118 16.025 0.000 1.899 0.645
.Sit6 1.671 0.159 10.517 0.000 1.671 0.714
Sit 1.000 1.000 1.000
R-Square:
Estimate
Sit1r 0.500
Sit2 0.259
Sit3r 0.605
Sit4 0.313
Sit5r 0.355
Sit6 0.286
There are multiple equivalent ways of getting the same CFA model, but with different scaling for the factor mean and variance (i.e., different means of identification). Now let’s see the model parameters when using the marker item for model identification instead. In the marker item identification method:
=~
symbol in lavaan
syntax) is set to oneThe lavaan
syntax is the same for the marker item identification, but the call to the lavaan()
function does not include the std.lv = TRUE
option, which defaults to the marker item method of identification:
model02Estimates = sem(model = model01SyntaxShort, data = hfsSituations, estimator = "MLR", mimic = "mplus")
summary(model02Estimates, fit.measures = TRUE, rsquare = TRUE, standardized = TRUE)
lavaan (0.5-23.1097) converged normally after 28 iterations
Number of observations 1103
Number of missing patterns 1
Estimator ML Robust
Minimum Function Test Statistic 427.937 307.804
Degrees of freedom 9 9
P-value (Chi-square) 0.000 0.000
Scaling correction factor 1.390
for the Yuan-Bentler correction (Mplus variant)
Model test baseline model:
Minimum Function Test Statistic 1981.034 1128.693
Degrees of freedom 15 15
P-value 0.000 0.000
User model versus baseline model:
Comparative Fit Index (CFI) 0.787 0.732
Tucker-Lewis Index (TLI) 0.645 0.553
Robust Comparative Fit Index (CFI) 0.787
Robust Tucker-Lewis Index (TLI) 0.646
Loglikelihood and Information Criteria:
Loglikelihood user model (H0) -11536.404 -11536.404
Scaling correction factor 1.416
for the MLR correction
Loglikelihood unrestricted model (H1) -11322.435 -11322.435
Scaling correction factor 1.407
for the MLR correction
Number of free parameters 18 18
Akaike (AIC) 23108.808 23108.808
Bayesian (BIC) 23198.912 23198.912
Sample-size adjusted Bayesian (BIC) 23141.739 23141.739
Root Mean Square Error of Approximation:
RMSEA 0.205 0.173
90 Percent Confidence Interval 0.189 0.222 0.160 0.188
P-value RMSEA <= 0.05 0.000 0.000
Robust RMSEA 0.205
90 Percent Confidence Interval 0.185 0.224
Standardized Root Mean Square Residual:
SRMR 0.086 0.086
Parameter Estimates:
Information Observed
Standard Errors Robust.huber.white
Latent Variables:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
Sit =~
Sit1r 1.000 1.234 0.707
Sit2 0.569 0.083 6.831 0.000 0.702 0.509
Sit3r 1.005 0.035 28.553 0.000 1.241 0.778
Sit4 0.636 0.082 7.741 0.000 0.784 0.559
Sit5r 0.829 0.053 15.698 0.000 1.023 0.596
Sit6 0.664 0.081 8.143 0.000 0.819 0.535
Intercepts:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.Sit1r 4.547 0.053 86.474 0.000 4.547 2.604
.Sit2 5.289 0.042 127.346 0.000 5.289 3.834
.Sit3r 4.896 0.048 101.959 0.000 4.896 3.070
.Sit4 5.359 0.042 126.896 0.000 5.359 3.821
.Sit5r 4.860 0.052 94.060 0.000 4.860 2.832
.Sit6 5.321 0.046 115.492 0.000 5.321 3.477
Sit 0.000 0.000 0.000
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.Sit1r 1.526 0.149 10.217 0.000 1.526 0.500
.Sit2 1.409 0.128 11.014 0.000 1.409 0.741
.Sit3r 1.004 0.135 7.456 0.000 1.004 0.395
.Sit4 1.352 0.127 10.672 0.000 1.352 0.687
.Sit5r 1.899 0.118 16.025 0.000 1.899 0.645
.Sit6 1.671 0.159 10.517 0.000 1.671 0.714
Sit 1.523 0.170 8.953 0.000 1.000 1.000
R-Square:
Estimate
Sit1r 0.500
Sit2 0.259
Sit3r 0.605
Sit4 0.313
Sit5r 0.355
Sit6 0.286
Here, loading for SIT1R is not tested and has no standard error (Std.Err
) because it is fixed to one. Also, note the Std.lv
estimates are identical to the unstandardized estimates in the Z-score factor identification method. Further, the Std.all
estimates are equal to the Std.all
estimates in the Z-score factor identification method. These will be equal for all identification methods.
Latent Variables:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
Sit =~
Sit1r 1.000 1.234 0.707
Sit2 0.569 0.083 6.831 0.000 0.702 0.509
Sit3r 1.005 0.035 28.553 0.000 1.241 0.778
Sit4 0.636 0.082 7.741 0.000 0.784 0.559
Sit5r 0.829 0.053 15.698 0.000 1.023 0.596
Sit6 0.664 0.081 8.143 0.000 0.819 0.535
Here, all are identical to what was found in the previous Z-score standardization. That is because we left the factor mean to be zero.
Intercepts:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.Sit1r 4.547 0.053 86.474 0.000 4.547 2.604
.Sit2 5.289 0.042 127.346 0.000 5.289 3.834
.Sit3r 4.896 0.048 101.959 0.000 4.896 3.070
.Sit4 5.359 0.042 126.896 0.000 5.359 3.821
.Sit5r 4.860 0.052 94.060 0.000 4.860 2.832
.Sit6 5.321 0.046 115.492 0.000 5.321 3.477
Sit 0.000 0.000 0.000
Here, the factor variance is estimated, which is different from the previous identification method.
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.Sit1r 1.526 0.149 10.217 0.000 1.526 0.500
.Sit2 1.409 0.128 11.014 0.000 1.409 0.741
.Sit3r 1.004 0.135 7.456 0.000 1.004 0.395
.Sit4 1.352 0.127 10.672 0.000 1.352 0.687
.Sit5r 1.899 0.118 16.025 0.000 1.899 0.645
.Sit6 1.671 0.159 10.517 0.000 1.671 0.714
Sit 1.523 0.170 8.953 0.000 1.000 1.000
To estimate this model, we have to return to the general lavaan
model function. In the syntax below you see 1*Sit1r
, setting the factor loading for Sit1r
to one (the marker item factor loading). Futher, you see Sit1r ~ 0
setting the item intercept of Sit1r
to zero, the marker item intercept. Finally, to estimate the factor mean, we use Sit ~ 1
, which estimates the factor mean explicitly.
model03Syntax = "
# Model 1 -- Fully Z-Scored Factor Identification Approach
# Item factor loadings --> list the factor to the left of the =~ and the items to the right, separated by a plus
# Once the factor is defined it can be used in syntax as if it is an observed variable
# Parameters can be fixed to constant values by using the * command; starting values can be specified by using start()*; labels can be implemented with []
Sit =~ 1*Sit1r + Sit2 + Sit3r + Sit4 + Sit5r + Sit6
# Item intercepts --> ~ 1 indicates means or intercepts
# You can put multiple lines of syntax on a single line using a ;
Sit1r ~ 0; Sit2 ~ 1; Sit3r ~ 1; Sit4 ~ 1; Sit5r ~ 1; Sit6 ~ 1;
# Item error (unique) variances and covariances --> use the ~~ command
Sit1r ~~ Sit1r; Sit2 ~~ Sit2; Sit3r ~~ Sit3r; Sit4 ~~ Sit4; Sit5r ~~ Sit5r; Sit6 ~~ Sit6;
# Factor variance
Sit ~~ Sit
# Factor mean (intercept)
Sit ~ 1
"
model03Estimates = sem(model = model03Syntax, data = hfsSituations, estimator = "MLR", mimic = "mplus")
summary(model03Estimates, fit.measures = TRUE, rsquare = TRUE, standardized = TRUE)
lavaan (0.5-23.1097) converged normally after 55 iterations
Number of observations 1103
Number of missing patterns 1
Estimator ML Robust
Minimum Function Test Statistic 427.937 307.804
Degrees of freedom 9 9
P-value (Chi-square) 0.000 0.000
Scaling correction factor 1.390
for the Yuan-Bentler correction (Mplus variant)
Model test baseline model:
Minimum Function Test Statistic 1981.034 1128.693
Degrees of freedom 15 15
P-value 0.000 0.000
User model versus baseline model:
Comparative Fit Index (CFI) 0.787 0.732
Tucker-Lewis Index (TLI) 0.645 0.553
Robust Comparative Fit Index (CFI) 0.787
Robust Tucker-Lewis Index (TLI) 0.646
Loglikelihood and Information Criteria:
Loglikelihood user model (H0) -11536.404 -11536.404
Scaling correction factor 1.416
for the MLR correction
Loglikelihood unrestricted model (H1) -11322.435 -11322.435
Scaling correction factor 1.407
for the MLR correction
Number of free parameters 18 18
Akaike (AIC) 23108.808 23108.808
Bayesian (BIC) 23198.912 23198.912
Sample-size adjusted Bayesian (BIC) 23141.739 23141.739
Root Mean Square Error of Approximation:
RMSEA 0.205 0.173
90 Percent Confidence Interval 0.189 0.222 0.160 0.188
P-value RMSEA <= 0.05 0.000 0.000
Robust RMSEA 0.205
90 Percent Confidence Interval 0.185 0.224
Standardized Root Mean Square Residual:
SRMR 0.086 0.086
Parameter Estimates:
Information Observed
Standard Errors Robust.huber.white
Latent Variables:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
Sit =~
Sit1r 1.000 1.234 0.707
Sit2 0.569 0.083 6.831 0.000 0.702 0.509
Sit3r 1.005 0.035 28.553 0.000 1.241 0.778
Sit4 0.636 0.082 7.741 0.000 0.784 0.559
Sit5r 0.829 0.053 15.698 0.000 1.023 0.596
Sit6 0.664 0.081 8.143 0.000 0.819 0.535
Intercepts:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.Sit1r 0.000 0.000 0.000
.Sit2 2.701 0.383 7.045 0.000 2.701 1.958
.Sit3r 0.325 0.171 1.899 0.058 0.325 0.204
.Sit4 2.469 0.380 6.502 0.000 2.469 1.760
.Sit5r 1.092 0.246 4.431 0.000 1.092 0.636
.Sit6 2.304 0.369 6.249 0.000 2.304 1.506
Sit 4.547 0.053 86.474 0.000 3.684 3.684
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.Sit1r 1.526 0.149 10.217 0.000 1.526 0.500
.Sit2 1.409 0.128 11.014 0.000 1.409 0.741
.Sit3r 1.004 0.135 7.456 0.000 1.004 0.395
.Sit4 1.352 0.127 10.672 0.000 1.352 0.687
.Sit5r 1.899 0.118 16.025 0.000 1.899 0.645
.Sit6 1.671 0.159 10.517 0.000 1.671 0.714
Sit 1.523 0.170 8.953 0.000 1.000 1.000
R-Square:
Estimate
Sit1r 0.500
Sit2 0.259
Sit3r 0.605
Sit4 0.313
Sit5r 0.355
Sit6 0.286
Here, you will note the model is identical to the previous two – the log-likelihood is -11536.404, same as previously. Now, the itercepts has the mean of item Sit1r
set to zero and the mean of the factor estimated.
Intercepts:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.Sit1r 0.000 0.000 0.000
.Sit2 2.701 0.383 7.045 0.000 2.701 1.958
.Sit3r 0.325 0.171 1.899 0.058 0.325 0.204
.Sit4 2.469 0.380 6.502 0.000 2.469 1.760
.Sit5r 1.092 0.246 4.431 0.000 1.092 0.636
.Sit6 2.304 0.369 6.249 0.000 2.304 1.506
Sit 4.547 0.053 86.474 0.000 3.684 3.684
The mean of the factor is 4.547, which, in this case, is the mean of item Sit1r
. That is because the factor loading of the item is set to 1.0. You will also notice that the other item intercepts have changed from previous models. In general, an item’s mean is found by adding the item intercept to the product of the factor loading times the factor mean, or \(\mu_i + \lambda_i\mu_F\).
Calculating model degrees of freedom (v is number of observed variables in a model):
Loglikelihood and Information Criteria:
Loglikelihood user model (H0) -11536.404 -11536.404
Scaling correction factor 1.416
for the MLR correction
Loglikelihood unrestricted model (H1) -11322.435 -11322.435
Scaling correction factor 1.407
for the MLR correction
Number of free parameters 18 18
Akaike (AIC) 23108.808 23108.808
Bayesian (BIC) 23198.912 23198.912
Sample-size adjusted Bayesian (BIC) 23141.739 23141.739
Estimator ML Robust
Minimum Function Test Statistic 427.937 307.804
Degrees of freedom 9 9
P-value (Chi-square) 0.000 0.000
Scaling correction factor 1.390
for the Yuan-Bentler correction (Mplus variant)
Where does this \(\chi^2\) value for “model fit” come from? A rescaled −2LL model comparison of this one-factor model (H0) against the saturated model (H1) that perfectly reproduces the data covariances:
How to fit the saturated (Unstructured) Baseline Model: Item means, variances, and covariances in original data:
modelSaturated = "
# Item intercepts --> ~ 1 indicates means or intercepts
# You can put multiple lines of syntax on a single line using a ;
Sit1r ~ 1; Sit2 ~ 1; Sit3r ~ 1; Sit4 ~ 1; Sit5r ~ 1; Sit6 ~ 1;
# Item variances and covariances --> use the ~~ command
# Variances:
Sit1r ~~ Sit1r; Sit2 ~~ Sit2; Sit3r ~~ Sit3r; Sit4 ~~ Sit4; Sit5r ~~ Sit5r; Sit6 ~~ Sit6;
# Covariances:
Sit1r ~~ Sit2; Sit1r ~~ Sit3r; Sit1r ~~ Sit4; Sit1r ~~ Sit5r; Sit1r ~~ Sit6;
Sit2 ~~ Sit3r; Sit2 ~~ Sit4; Sit2 ~~ Sit5r; Sit2 ~~ Sit6;
Sit3r ~~ Sit4; Sit3r ~~ Sit5r; Sit3r ~~ Sit6;
Sit4 ~~ Sit5r; Sit4 ~~ Sit6;
Sit5r ~~ Sit6;
"
modelSaturatedEstimates = sem(model = modelSaturated, data = hfsSituations, estimator = "MLR", mimic = "mplus")
summary(modelSaturatedEstimates, fit.measures = TRUE)
Note that H0 and H1 are now the same! Our H0 model IS the H1 saturated model: the log-likelihood of H0 and H1 are the same. Also, note there are 27 free parameters (six item means, six item variances, and 15 covariances).
Loglikelihood and Information Criteria:
Loglikelihood user model (H0) -11322.435 -11322.435
Loglikelihood unrestricted model (H1) -11322.435 -11322.435
Number of free parameters 27 27
Akaike (AIC) 22698.870 22698.870
Bayesian (BIC) 22834.027 22834.027
Sample-size adjusted Bayesian (BIC) 22748.268 22748.268
Also note how the model \(\chi^2\) test has zero degrees of freedom: this is because all 27 possible free parameters were estimated. This model fits the data perfectly.
Estimator ML Robust
Minimum Function Test Statistic 0.000 0.000
Degrees of freedom 0 0
Minimum Function Value 0.0000000000000
Scaling correction factor NA
for the Yuan-Bentler correction (Mplus variant)
User model versus baseline model:
Comparative Fit Index (CFI) 0.787 0.732
Tucker-Lewis Index (TLI) 0.645 0.553
Robust Comparative Fit Index (CFI) 0.787
Robust Tucker-Lewis Index (TLI) 0.646
Root Mean Square Error of Approximation:
RMSEA 0.205 0.173
90 Percent Confidence Interval 0.189 0.222 0.160 0.188
P-value RMSEA <= 0.05 0.000 0.000
Robust RMSEA 0.205
90 Percent Confidence Interval 0.185 0.224
Standardized Root Mean Square Residual:
SRMR 0.086 0.086
This compares the estimated model to the one that has no covariances. It is a last-resort type model in that if the estimated model has a test with a non-significant p-value, then this indicates there will be no factors to be found in the data.
Model test baseline model:
Minimum Function Test Statistic 1981.034 1128.693
Degrees of freedom 15 15
P-value 0.000 0.000
Where does this χ2 value for “fit of the baseline model” come from? A rescaled −2LL model comparison of the independence model with NO covariances to the saturated model:
Step 2: Scaling correction = [ (#parmsfewerscalefewer) – (#parmsmorescalemore) ] / (#parmsfewer – #parmsmore) = [ (12 * 0.9725) – (27 * 1.4073) ] / (12 – 27) = –26.372 / −15 = 1.7551
Step 4: Difference in df = #parmsmore – #parmsfewer = 27 – 12 = 15
What’s the point? This baseline model fit test tells us whether there are any covariances at all (i.e., whether it even makes sense to try to fit latent factors to predict them).
modelIndependence = "
# Item intercepts --> ~ 1 indicates means or intercepts
# You can put multiple lines of syntax on a single line using a ;
Sit1r ~ 1; Sit2 ~ 1; Sit3r ~ 1; Sit4 ~ 1; Sit5r ~ 1; Sit6 ~ 1;
# Item variances and covariances --> use the ~~ command
# Variances:
Sit1r ~~ Sit1r; Sit2 ~~ Sit2; Sit3r ~~ Sit3r; Sit4 ~~ Sit4; Sit5r ~~ Sit5r; Sit6 ~~ Sit6;
"
modelIndependenceEstimates = sem(model = modelIndependence, data = hfsSituations, estimator = "MLR", mimic = "mplus")
summary(modelIndependenceEstimates, fit.measures = TRUE)
Note how the test of the model vs. saturated is identical to the test of the model vs. baseline: that indicates this model is the baseline model.
Although not 0, this is the worst possible RMSEA while still allowing separate means and variances per item in these data. RMSEA is a parsimony-corrected absolute fit index (so, its fit is relative to the saturated model).
CFI and TLI are 0 because they are “incremental fit” indices relative to the independence model (which this is).
SRMR is also an absolute fit index (relative to saturated model), so this is the worst it gets for these data, too.
For this section, any of the three equivalent single factor models can be used.
So global fit for the one-factor model is not so good (RMSEA = .173, CFI = .732). What do the voo-doo modification indices suggest we do to fix it? To get modification indices, use the modificationindices()
function. Here, the function is called with options:
object = model01Estimates
: The model estimates object from which to derive modification indices.sort. = TRUE
: Sort the output from highest to lowest modification index.modificationindices(object = model01Estimates, sort. = TRUE)
lhs op rhs mi mi.scaled epc sepc.lv sepc.all sepc.nox
27 Sit2 ~~ Sit4 224.295 161.330 0.702 0.702 0.363 0.363
22 Sit1r ~~ Sit3r 199.679 143.624 1.023 1.023 0.368 0.368
29 Sit2 ~~ Sit6 88.836 63.897 0.486 0.486 0.230 0.230
21 Sit1r ~~ Sit2 68.985 49.619 -0.464 -0.464 -0.193 -0.193
34 Sit4 ~~ Sit6 64.710 46.544 0.415 0.415 0.193 0.193
23 Sit1r ~~ Sit4 50.442 36.281 -0.403 -0.403 -0.164 -0.164
26 Sit2 ~~ Sit3r 48.505 34.888 -0.357 -0.357 -0.162 -0.162
30 Sit3r ~~ Sit4 40.616 29.214 -0.336 -0.336 -0.150 -0.150
25 Sit1r ~~ Sit6 33.475 24.078 -0.358 -0.358 -0.134 -0.134
32 Sit3r ~~ Sit6 31.132 22.392 -0.319 -0.319 -0.131 -0.131
28 Sit2 ~~ Sit5r 7.117 5.119 -0.151 -0.151 -0.064 -0.064
33 Sit4 ~~ Sit5r 6.906 4.967 -0.149 -0.149 -0.062 -0.062
24 Sit1r ~~ Sit5r 6.417 4.615 0.176 0.176 0.059 0.059
31 Sit3r ~~ Sit5r 3.548 2.552 0.123 0.123 0.045 0.045
35 Sit5r ~~ Sit6 0.736 0.529 -0.053 -0.053 -0.020 -0.020
The output includes:
lhs
: Left hand side for parameter (in lavaan
syntax)op
: Operation of parameter: ~~
is for (residual) covariances; =~
is for additional factor loadings. All other lavaan
syntax types are possible, too, but here item means and variances fit perfectly (see below)rhs
: Right hand side for parameter (in lavaan
syntax)mi
: Modification index using MLmi.scaled
: Modification index using MLR (use this one)epc
: Expected parameter change (from zero)sepc.lv
: Standardized expected parameter change using std.lv
standardizationsepc.all
: Standardized expected parameter change using std.all
standardizationsepc.nox
: Standardized expected parameter change using std.nox
standardizationHighest values are for error covariances (for unknown multidimensionality).
Another approach—how about we examine local fit and see where the problems seem to be?
The means and variances of the items will be perfectly reproduced, so that’s not an issue. Misfit results from the difference between the observed and model-predicted covariances.
lavaan
gives us the “residual” (defined as observed – predicted) or “leftover” covariance matrix, but it is scale dependent and thus not so helpful. We can use the resid()
function to obtain the residual covariance matrix (observed covariances minus model-implied covariances):
resid(object = model01Estimates, type = "raw")
$type
[1] "raw"
$cov
Sit1r Sit2 Sit3r Sit4 Sit5r Sit6
Sit1r 0.000
Sit2 -0.290 0.000
Sit3r 0.271 -0.174 0.000
Sit4 -0.234 0.552 -0.149 0.000
Sit5r 0.096 -0.114 0.050 -0.108 0.000
Sit6 -0.216 0.390 -0.149 0.319 -0.040 0.000
$mean
Sit1r Sit2 Sit3r Sit4 Sit5r Sit6
0 0 0 0 0 0
lavaan
also gives us “normalized” residuals, which can be thought of as z-scores for how large the residual leftover covariance is in absolute terms. Because the denominator decreases with sample size, however, these values may be inflated in large samples, so look for relatively large values.
“Normalized” Residuals for Inter-Item Covariances = (observed – predicted) / SD(observed):
resid(object = model01Estimates, type = "normalized")
$type
[1] "normalized"
$cov
Sit1r Sit2 Sit3r Sit4 Sit5r Sit6
Sit1r 0.000
Sit2 -3.503 0.000
Sit3r 2.977 -2.253 0.000
Sit4 -2.928 6.560 -1.959 0.000
Sit5r 0.960 -1.434 0.548 -1.372 0.000
Sit6 -2.345 4.721 -1.756 3.925 -0.444 0.000
$mean
Sit1r Sit2 Sit3r Sit4 Sit5r Sit6
0 0 0 0 0 0
NEGATIVE NORMALIZED RESIDUAL: Less related than you predicted (don’t want to be together) POSITIVE NORMALIZED RESIDUAL: More related than you predicted (want to be more together)
Why might the normalized residuals (leftover correlations) for the positive-worded items be larger than for the negatively-worded items?
These results suggest that wording valence is playing a larger role in the pattern of covariance across items than what the one-factor model predicts. Rather than adding voo-doo covariances among the residuals for specific items, how about a two-factor model based on wording instead?
Here, we use the shortened syntax and the sem()
function to get results.
model04Syntax = "
# SitP loadings (all estimated)
SitP =~ Sit2 + Sit4 + Sit6
# SitN loadings (all estimated)
SitN =~ Sit1r + Sit3r + Sit5r
"
model04Estimates = sem(model = model04Syntax, data = hfsSituations, estimator = "MLR", mimic = "mplus", std.lv = TRUE)
summary(model04Estimates, fit.measures = TRUE, rsquare = TRUE, standardized = TRUE)
lavaan (0.5-23.1097) converged normally after 23 iterations
Number of observations 1103
Number of missing patterns 1
Estimator ML Robust
Minimum Function Test Statistic 35.410 24.924
Degrees of freedom 8 8
P-value (Chi-square) 0.000 0.002
Scaling correction factor 1.421
for the Yuan-Bentler correction (Mplus variant)
Model test baseline model:
Minimum Function Test Statistic 1981.034 1128.693
Degrees of freedom 15 15
P-value 0.000 0.000
User model versus baseline model:
Comparative Fit Index (CFI) 0.986 0.985
Tucker-Lewis Index (TLI) 0.974 0.972
Robust Comparative Fit Index (CFI) 0.988
Robust Tucker-Lewis Index (TLI) 0.977
Loglikelihood and Information Criteria:
Loglikelihood user model (H0) -11340.140 -11340.140
Scaling correction factor 1.402
for the MLR correction
Loglikelihood unrestricted model (H1) -11322.435 -11322.435
Scaling correction factor 1.407
for the MLR correction
Number of free parameters 19 19
Akaike (AIC) 22718.281 22718.281
Bayesian (BIC) 22813.391 22813.391
Sample-size adjusted Bayesian (BIC) 22753.042 22753.042
Root Mean Square Error of Approximation:
RMSEA 0.056 0.044
90 Percent Confidence Interval 0.038 0.075 0.028 0.061
P-value RMSEA <= 0.05 0.277 0.708
Robust RMSEA 0.052
90 Percent Confidence Interval 0.030 0.076
Standardized Root Mean Square Residual:
SRMR 0.029 0.029
Parameter Estimates:
Information Observed
Standard Errors Robust.huber.white
Latent Variables:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
SitP =~
Sit2 1.007 0.052 19.487 0.000 1.007 0.730
Sit4 1.064 0.050 21.195 0.000 1.064 0.759
Sit6 0.956 0.053 18.203 0.000 0.956 0.625
SitN =~
Sit1r 1.325 0.048 27.698 0.000 1.325 0.759
Sit3r 1.349 0.044 30.514 0.000 1.349 0.846
Sit5r 1.009 0.055 18.358 0.000 1.009 0.588
Covariances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
SitP ~~
SitN 0.564 0.041 13.775 0.000 0.564 0.564
Intercepts:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.Sit2 5.289 0.042 127.346 0.000 5.289 3.834
.Sit4 5.359 0.042 126.896 0.000 5.359 3.821
.Sit6 5.321 0.046 115.492 0.000 5.321 3.477
.Sit1r 4.547 0.053 86.474 0.000 4.547 2.604
.Sit3r 4.896 0.048 101.959 0.000 4.896 3.070
.Sit5r 4.860 0.052 94.060 0.000 4.860 2.832
SitP 0.000 0.000 0.000
SitN 0.000 0.000 0.000
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.Sit2 0.888 0.097 9.173 0.000 0.888 0.467
.Sit4 0.835 0.093 9.003 0.000 0.835 0.425
.Sit6 1.428 0.134 10.684 0.000 1.428 0.610
.Sit1r 1.294 0.103 12.547 0.000 1.294 0.425
.Sit3r 0.724 0.092 7.857 0.000 0.724 0.285
.Sit5r 1.926 0.119 16.128 0.000 1.926 0.654
SitP 1.000 1.000 1.000
SitN 1.000 1.000 1.000
R-Square:
Estimate
Sit2 0.533
Sit4 0.575
Sit6 0.390
Sit1r 0.575
Sit3r 0.715
Sit5r 0.346
This model fits better (we would call this good fit). We can also compare the fit of this model with Model 1 using the anova()
function. Here, the null hypothesis is that Model 1 fits as well as Model 4.
anova(model01Estimates, model04Estimates)
Scaled Chi Square Difference Test (method = "satorra.bentler.2001")
Df AIC BIC Chisq Chisq diff Df diff Pr(>Chisq)
model04Estimates 8 22718 22813 35.41
model01Estimates 9 23109 23199 427.94 342.29 1 < 2.2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
This function conducts the MLR version of the likelihood ratio test for us. The significant difference indicates that Model 4 fits better than Model 1.
Let’s see: Any more local fit problems? The modification indices are better (values of a little over 10 seem okay).
modificationindices(object = model04Estimates, sort. = TRUE)
lhs op rhs mi mi.scaled epc sepc.lv sepc.all sepc.nox
29 SitN =~ Sit6 15.383 10.827 0.245 0.245 0.160 0.160
30 Sit2 ~~ Sit4 15.382 10.827 0.332 0.332 0.172 0.172
35 Sit4 ~~ Sit6 13.886 9.774 -0.273 -0.273 -0.127 -0.127
27 SitN =~ Sit2 13.886 9.773 -0.224 -0.224 -0.162 -0.162
24 SitP =~ Sit1r 9.244 6.507 -0.223 -0.223 -0.128 -0.128
44 Sit3r ~~ Sit5r 9.244 6.507 -0.277 -0.277 -0.101 -0.101
42 Sit1r ~~ Sit3r 7.549 5.313 0.409 0.409 0.147 0.147
26 SitP =~ Sit5r 7.549 5.313 0.191 0.191 0.111 0.111
32 Sit2 ~~ Sit1r 6.685 4.705 -0.115 -0.115 -0.048 -0.048
41 Sit6 ~~ Sit5r 5.847 4.115 0.138 0.138 0.052 0.052
40 Sit6 ~~ Sit3r 1.321 0.930 0.052 0.052 0.021 0.021
43 Sit1r ~~ Sit5r 0.667 0.469 0.071 0.071 0.024 0.024
25 SitP =~ Sit3r 0.667 0.469 0.060 0.060 0.037 0.037
33 Sit2 ~~ Sit3r 0.402 0.283 -0.025 -0.025 -0.012 -0.012
39 Sit6 ~~ Sit1r 0.346 0.244 0.030 0.030 0.011 0.011
36 Sit4 ~~ Sit1r 0.307 0.216 -0.025 -0.025 -0.010 -0.010
38 Sit4 ~~ Sit5r 0.201 0.142 0.022 0.022 0.009 0.009
34 Sit2 ~~ Sit5r 0.129 0.091 0.017 0.017 0.007 0.007
37 Sit4 ~~ Sit3r 0.101 0.071 0.013 0.013 0.006 0.006
31 Sit2 ~~ Sit6 0.023 0.016 0.010 0.010 0.005 0.005
28 SitN =~ Sit4 0.023 0.016 0.009 0.009 0.007 0.007
The normalized residuals have two values that are bigger in absolute value than 2:
resid(object = model04Estimates, type = "normalized")
$type
[1] "normalized"
$cov
Sit2 Sit4 Sit6 Sit1r Sit3r Sit5r
Sit2 0.000
Sit4 0.370 0.000
Sit6 0.031 -0.676 0.000
Sit1r -2.125 -0.768 0.870 0.000
Sit3r -0.896 0.192 1.658 0.172 0.000
Sit5r 0.382 1.128 2.847 0.212 -0.464 0.000
$mean
Sit2 Sit4 Sit6 Sit1r Sit3r Sit5r
0 0 0 0 0 0
Because we have no real theoretical or defendable reason to fit any of these suggested parameters, we will not add any new parameters. This will be about as good as it gets.
As we have demonstrated good model fit, estimates of trait reliabilities can now be found. Although I do not dispense such advice, many people use a CFA to indicate model fit then proceed to use sum scores for each trait in a model. I do not like this because sum scores are perfectly correlated with the factor scores from the parallel items model – but we never tested that model (should it not fit we may get a reversal in the rank ordering of people by score).
That said, reliabilities for sum scores can be found by using CFA model parameters. This reliability coefficient is called “Omega” by Rod McDonald and is calculated:
\[ \rho_\omega = \frac{\sigma^2_F\left( \sum_{i=1}^{Items} \lambda_i \right)^2}{\sigma^2_F\left( \sum_{i=1}^{Items} \lambda_i \right)^2 \sum_{i=1}^{Items} \sum_{j=1}^{Items} \psi_{ij}}\]
In words, Omega = Var(Factor)* (Sum of loadings)^2 / [ Var(Factor)* (Sum of loadings)^2 + Sum of error variances + 2* Sum of error covariances]
For our data:
(alpha was .746)
(alpha was .780)
lavaan
Through the use of parameter labels and new parameters, you can use lavaan
to calculate Omega. In this syntax, the terms before the items are labels representing the factor loadings and unique variances. They can be used later in the syntax to form new parameters or to put constraints on the model.
model04SyntaxOmega = "
# SitP loadings (all estimated)
SitP =~ L2*Sit2 + L4*Sit4 + L6*Sit6
# SitN loadings (all estimated)
SitN =~ L1*Sit1r + L3*Sit3r + L5*Sit5r
# Unique Variances:
Sit1r ~~ E1*Sit1r; Sit2 ~~ E2*Sit2; Sit3r ~~ E3*Sit3r; Sit4 ~~ E4*Sit4; Sit5r ~~ E5*Sit5r; Sit6 ~~ E6*Sit6;
# Calculate Omega Reliability for Sum Scores:
OmegaP := ((L2 + L4 + L6)^2) / ( ((L2 + L4 + L6)^2) + E2 + E4 + E6)
OmegaN := ((L1 + L3 + L5)^2) / ( ((L1 + L3 + L5)^2) + E1 + E3 + E5)
"
model04EstimatesOmega = sem(model = model04SyntaxOmega, data = hfsSituations, estimator = "MLR", mimic = "mplus", std.lv = TRUE)
summary(model04EstimatesOmega, fit.measures = FALSE, rsquare = FALSE, standardized = FALSE, header = FALSE)
Parameter Estimates:
Information Observed
Standard Errors Robust.huber.white
Latent Variables:
Estimate Std.Err z-value P(>|z|)
SitP =~
Sit2 (L2) 1.007 0.052 19.487 0.000
Sit4 (L4) 1.064 0.050 21.195 0.000
Sit6 (L6) 0.956 0.053 18.203 0.000
SitN =~
Sit1r (L1) 1.325 0.048 27.698 0.000
Sit3r (L3) 1.349 0.044 30.514 0.000
Sit5r (L5) 1.009 0.055 18.358 0.000
Covariances:
Estimate Std.Err z-value P(>|z|)
SitP ~~
SitN 0.564 0.041 13.775 0.000
Intercepts:
Estimate Std.Err z-value P(>|z|)
.Sit2 5.289 0.042 127.346 0.000
.Sit4 5.359 0.042 126.896 0.000
.Sit6 5.321 0.046 115.492 0.000
.Sit1r 4.547 0.053 86.474 0.000
.Sit3r 4.896 0.048 101.959 0.000
.Sit5r 4.860 0.052 94.060 0.000
SitP 0.000
SitN 0.000
Variances:
Estimate Std.Err z-value P(>|z|)
.Sit1r (E1) 1.294 0.103 12.547 0.000
.Sit2 (E2) 0.888 0.097 9.173 0.000
.Sit3r (E3) 0.724 0.092 7.857 0.000
.Sit4 (E4) 0.835 0.093 9.003 0.000
.Sit5r (E5) 1.926 0.119 16.128 0.000
.Sit6 (E6) 1.428 0.134 10.684 0.000
SitP 1.000
SitN 1.000
Defined Parameters:
Estimate Std.Err z-value P(>|z|)
OmegaP 0.744 0.020 37.956 0.000
OmegaN 0.775 0.014 56.802 0.000
Calculating Omega through `lavaan
also provides its standard error. Note: the Omega equation used above is for the Z-score factor identification method. If the marker-item method is used for factor loadings, then multiply the sum of the loadings by the variance of the factor, as shown in the equation above.
Of note about Omega: If we constrain all factor loadings of a trait to be equal (called Tau-Equivalent), Omega is equal to Alpha. In the syntax below, putting the labels LP and LN in multiple places causes the factor loadings for those places to be constrained to be equal:
model04SyntaxAlpha = "
# SitP loadings (all estimated)
SitP =~ LP*Sit2 + LP*Sit4 + LP*Sit6
# SitN loadings (all estimated)
SitN =~ LN*Sit1r + LN*Sit3r + LN*Sit5r
# Unique Variances:
Sit1r ~~ E1*Sit1r; Sit2 ~~ E2*Sit2; Sit3r ~~ E3*Sit3r; Sit4 ~~ E4*Sit4; Sit5r ~~ E5*Sit5r; Sit6 ~~ E6*Sit6;
# Calculate Omega Reliability for Sum Scores:
OmegaP := ((3*LP)^2) / ( ((3*LP)^2) + E2 + E4 + E6)
OmegaN := ((3*LN)^2) / ( ((3*LN)^2) + E1 + E3 + E5)
"
model04EstimatesAlpha = sem(model = model04SyntaxAlpha, data = hfsSituations, estimator = "MLR", mimic = "mplus", std.lv = TRUE)
summary(model04EstimatesAlpha, fit.measures = TRUE, rsquare = TRUE, standardized = TRUE)
lavaan (0.5-23.1097) converged normally after 18 iterations
Number of observations 1103
Number of missing patterns 1
Estimator ML Robust
Minimum Function Test Statistic 73.500 55.487
Degrees of freedom 12 12
P-value (Chi-square) 0.000 0.000
Scaling correction factor 1.325
for the Yuan-Bentler correction (Mplus variant)
Model test baseline model:
Minimum Function Test Statistic 1981.034 1128.693
Degrees of freedom 15 15
P-value 0.000 0.000
User model versus baseline model:
Comparative Fit Index (CFI) 0.969 0.961
Tucker-Lewis Index (TLI) 0.961 0.951
Robust Comparative Fit Index (CFI) 0.971
Robust Tucker-Lewis Index (TLI) 0.963
Loglikelihood and Information Criteria:
Loglikelihood user model (H0) -11359.185 -11359.185
Scaling correction factor 1.163
for the MLR correction
Loglikelihood unrestricted model (H1) -11322.435 -11322.435
Scaling correction factor 1.407
for the MLR correction
Number of free parameters 15 15
Akaike (AIC) 22748.370 22748.370
Bayesian (BIC) 22823.457 22823.457
Sample-size adjusted Bayesian (BIC) 22775.813 22775.813
Root Mean Square Error of Approximation:
RMSEA 0.068 0.057
90 Percent Confidence Interval 0.054 0.084 0.044 0.071
P-value RMSEA <= 0.05 0.021 0.167
Robust RMSEA 0.066
90 Percent Confidence Interval 0.049 0.084
Standardized Root Mean Square Residual:
SRMR 0.061 0.061
Parameter Estimates:
Information Observed
Standard Errors Robust.huber.white
Latent Variables:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
SitP =~
Sit2 (LP) 1.014 0.036 28.384 0.000 1.014 0.734
Sit4 (LP) 1.014 0.036 28.384 0.000 1.014 0.733
Sit6 (LP) 1.014 0.036 28.384 0.000 1.014 0.653
SitN =~
Sit1r (LN) 1.254 0.032 38.954 0.000 1.254 0.735
Sit3r (LN) 1.254 0.032 38.954 0.000 1.254 0.804
Sit5r (LN) 1.254 0.032 38.954 0.000 1.254 0.682
Covariances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
SitP ~~
SitN 0.578 0.041 14.230 0.000 0.578 0.578
Intercepts:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.Sit2 5.289 0.042 127.346 0.000 5.289 3.827
.Sit4 5.359 0.042 126.896 0.000 5.359 3.872
.Sit6 5.321 0.046 115.492 0.000 5.321 3.427
.Sit1r 4.547 0.053 86.474 0.000 4.547 2.667
.Sit3r 4.896 0.048 101.959 0.000 4.896 3.141
.Sit5r 4.860 0.052 94.060 0.000 4.860 2.645
SitP 0.000 0.000 0.000
SitN 0.000 0.000 0.000
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.Sit1r (E1) 1.335 0.083 16.168 0.000 1.335 0.459
.Sit2 (E2) 0.882 0.083 10.610 0.000 0.882 0.462
.Sit3r (E3) 0.857 0.069 12.339 0.000 0.857 0.353
.Sit4 (E4) 0.887 0.075 11.779 0.000 0.887 0.463
.Sit5r (E5) 1.805 0.115 15.710 0.000 1.805 0.534
.Sit6 (E6) 1.382 0.118 11.703 0.000 1.382 0.573
SitP 1.000 1.000 1.000
SitN 1.000 1.000 1.000
R-Square:
Estimate
Sit1r 0.541
Sit2 0.538
Sit3r 0.647
Sit4 0.537
Sit5r 0.466
Sit6 0.427
Defined Parameters:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
OmegaP 0.746 0.020 38.193 0.000 0.746 0.764
OmegaN 0.780 0.013 58.288 0.000 0.780 0.783
Compared with what the psych
package gives:
alpha(x = hfsSituations[c("Sit2", "Sit4", "Sit6")], use = "all.obs")
Reliability analysis
Call: alpha(x = hfsSituations[c("Sit2", "Sit4", "Sit6")], use = "all.obs")
raw_alpha std.alpha G6(smc) average_r S/N ase mean sd
0.74 0.74 0.67 0.49 2.9 0.014 5.3 1.2
lower alpha upper 95% confidence boundaries
0.71 0.74 0.77
Reliability if an item is dropped:
raw_alpha std.alpha G6(smc) average_r S/N alpha se
Sit2 0.62 0.62 0.45 0.45 1.6 0.023
Sit4 0.63 0.63 0.46 0.46 1.7 0.022
Sit6 0.73 0.73 0.57 0.57 2.7 0.016
Item statistics
n raw.r std.r r.cor r.drop mean sd
Sit2 1103 0.82 0.83 0.71 0.60 5.3 1.4
Sit4 1103 0.82 0.83 0.70 0.59 5.4 1.4
Sit6 1103 0.80 0.78 0.59 0.51 5.3 1.5
Non missing response frequency for each item
1 2 3 4 5 6 7 miss
Sit2 0.02 0.02 0.05 0.13 0.31 0.24 0.22 0
Sit4 0.02 0.02 0.04 0.15 0.27 0.24 0.25 0
Sit6 0.03 0.03 0.06 0.14 0.23 0.25 0.27 0
alpha(x = hfsSituations[c("Sit1r", "Sit3r", "Sit5r")], use = "all.obs")
Reliability analysis
Call: alpha(x = hfsSituations[c("Sit1r", "Sit3r", "Sit5r")], use = "all.obs")
raw_alpha std.alpha G6(smc) average_r S/N ase mean sd
0.77 0.77 0.7 0.53 3.3 0.012 4.8 1.4
lower alpha upper 95% confidence boundaries
0.74 0.77 0.79
Reliability if an item is dropped:
raw_alpha std.alpha G6(smc) average_r S/N alpha se
Sit1r 0.65 0.65 0.48 0.48 1.9 0.021
Sit3r 0.62 0.62 0.45 0.45 1.7 0.023
Sit5r 0.78 0.79 0.65 0.65 3.7 0.013
Item statistics
n raw.r std.r r.cor r.drop mean sd
Sit1r 1103 0.85 0.85 0.74 0.63 4.5 1.7
Sit3r 1103 0.85 0.86 0.76 0.66 4.9 1.6
Sit5r 1103 0.78 0.78 0.58 0.51 4.9 1.7
Non missing response frequency for each item
1 2 3 4 5 6 7 miss
Sit1r 0.05 0.08 0.16 0.19 0.18 0.17 0.17 0
Sit3r 0.03 0.05 0.12 0.20 0.21 0.20 0.20 0
Sit5r 0.04 0.06 0.13 0.18 0.18 0.19 0.22 0
Factor scores have a long history with many critiques and complaints. That said, factor scores can be constructed analogously across measurement models using Bayesian estmates. We will describe this more in a later lecture. For our purposes, in CFA, the two main types of Bayesian Estimates (MAP: Maximum A Posteriori and EAP: Expected A Posteriori) are identical. lavaan
will not provide standard errors for factor scores, so I created a function factorScores()
to not only create the factor scores using EAP (again, identical to MAP) but also to give standard errors for each score. The function takes the lavaan
model output and returns a list of three elements: scores
containing factor scores and standard errors, factorCov
containing the theoretical covariance matrix of the factor scores, and factorCor
the theoretical correlation matrix of the factor scores.
To determine the reliability of a factor score, we must return to our classical notion of what reliability means: \[ \rho = \frac{Var(True)}{Var(Total)} = \frac{Var(True)}{Var(True) + Var(Error)} = \frac{Var(F)}{Var(F)+SE(F)^2}\] In CFA, \(Var(True)\) is found by the estimated factor variance and \(Var(Error)\) comes from the squared standard error of a factor score (from an observation with complete data).
I created a function named factorScoreReliability()
to calculate the reliability for factor scores. Using the function we can see that the reliabiltiy of our factor scores is .817 for SitP and .851 for SitN, which har higher than the Omega reliablities for sum scores. In general, factor score reliabilities are greater than what is found in Omega and Alpha due to the use of a prior distribution for the factor itself and due to the factor correlation allowing indirect information from other factors to help in the estimation of each factor.
factorScores = function(lavObject){
output = inspect(object = lavObject, what = "est")
sigma = output$lambda %*% output$psi %*% t(output$lambda) + output$theta
modelData = lavObject@Data@X[[1]]
scores = t(output$alpha%*%matrix(1, nrow=1, ncol=dim(modelData)[1]) +
output$psi %*% t(output$lambda) %*% solve(sigma)%*%(t(modelData) - output$nu%*%matrix(1, nrow=1, ncol=dim(modelData)[1])))
varscores = output$psi - output$psi %*% t(output$lambda) %*% solve(sigma) %*% output$lambda %*% output$psi
factorSE = sqrt(diag(varscores))
names(factorSE) = paste0(names(factorSE), ".SE")
factorSEmat = matrix(1, nrow=nrow(scores), ncol = 1) %*% matrix(factorSE, nrow = 1, ncol = ncol(scores))
colnames(factorSEmat) = names(factorSE)
result = data.frame(cbind(scores, factorSEmat))
names(result)
odds = seq(1, ncol(result)-1, 2)
evens = seq(2, ncol(result), 2)
result = result[c(odds,evens)]
factorCov = varscores
factorCorr = solve(sqrt(diag(diag(varscores)))) %*% varscores %*% solve(sqrt(diag(diag(varscores))))
return(list(scores = result, factorCov = factorCov, factorCorr = factorCorr))
}
factorScoreReliability = function(lavObject){
output = inspect(object = lavObject, what = "est")
sigma = output$lambda %*% output$psi %*% t(output$lambda) + output$theta
varscores = output$psi - output$psi %*% t(output$lambda) %*% solve(sigma) %*% output$lambda %*% output$psi
return(diag(output$psi)/(diag(output$psi) + diag(varscores)))
}
factorScoreReliability(lavObject = model04Estimates)
SitP SitN
0.8177844 0.8510698
We can also compare the model fit of the CFA model with the tau-equivalent model, essentially testing whether or not CTT (and alpha) is appropriate:
anova(model04EstimatesAlpha, model04EstimatesOmega)
Scaled Chi Square Difference Test (method = "satorra.bentler.2001")
Df AIC BIC Chisq Chisq diff Df diff Pr(>Chisq)
model04EstimatesOmega 8 22718 22813 35.41
model04EstimatesAlpha 12 22748 22824 73.50 33.635 4 8.851e-07 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
The significant difference indicates that the tau-equivalent assumptions of equal item loadings do not hold for all of the data. In the Mplus handout, Lesa Hoffman shows that tau equivalence holds for one of the subscales, but we don’t need to have tau equivalence to move forward with results as CFA is subsumes the tau-equivalent model (meaning it will be tau equivalent if the data are tau equivalent but can also be more general).
Another model often discussed in CTT is the parallel items model. The CFA version of the parallel items model is a model where all factor loadings for a factor are constrained to be equal and all unique variances for a factor are constrained to be equal. This model is one step more restrictive than the tau equivalent items model. Syntax for this model is as follows:
model04SyntaxSB = "
# SitP loadings (all estimated)
SitP =~ LP*Sit2 + LP*Sit4 + LP*Sit6
# SitN loadings (all estimated)
SitN =~ LN*Sit1r + LN*Sit3r + LN*Sit5r
# Unique Variances:
Sit1r ~~ EN*Sit1r; Sit2 ~~ EP*Sit2; Sit3r ~~ EN*Sit3r; Sit4 ~~ EP*Sit4; Sit5r ~~ EN*Sit5r; Sit6 ~~ EP*Sit6;
# Calculate Omega Reliability for Sum Scores:
OmegaP := ((3*LP)^2) / ( ((3*LP)^2) + 3*EP)
OmegaN := ((3*LN)^2) / ( ((3*LN)^2) + 3*EN)
"
model04EstimatesSB = sem(model = model04SyntaxSB, data = hfsSituations, estimator = "MLR", mimic = "mplus", std.lv = TRUE)
summary(model04EstimatesSB, fit.measures = TRUE, rsquare = TRUE, standardized = TRUE)
lavaan (0.5-23.1097) converged normally after 12 iterations
Number of observations 1103
Number of missing patterns 1
Estimator ML Robust
Minimum Function Test Statistic 181.247 132.341
Degrees of freedom 16 16
P-value (Chi-square) 0.000 0.000
Scaling correction factor 1.370
for the Yuan-Bentler correction (Mplus variant)
Model test baseline model:
Minimum Function Test Statistic 1981.034 1128.693
Degrees of freedom 15 15
P-value 0.000 0.000
User model versus baseline model:
Comparative Fit Index (CFI) 0.916 0.896
Tucker-Lewis Index (TLI) 0.921 0.902
Robust Comparative Fit Index (CFI) 0.918
Robust Tucker-Lewis Index (TLI) 0.924
Loglikelihood and Information Criteria:
Loglikelihood user model (H0) -11413.059 -11413.059
Scaling correction factor 0.847
for the MLR correction
Loglikelihood unrestricted model (H1) -11322.435 -11322.435
Scaling correction factor 1.407
for the MLR correction
Number of free parameters 11 11
Akaike (AIC) 22848.117 22848.117
Bayesian (BIC) 22903.181 22903.181
Sample-size adjusted Bayesian (BIC) 22868.242 22868.242
Root Mean Square Error of Approximation:
RMSEA 0.097 0.081
90 Percent Confidence Interval 0.084 0.110 0.070 0.092
P-value RMSEA <= 0.05 0.000 0.000
Robust RMSEA 0.095
90 Percent Confidence Interval 0.080 0.110
Standardized Root Mean Square Residual:
SRMR 0.088 0.088
Parameter Estimates:
Information Observed
Standard Errors Robust.huber.white
Latent Variables:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
SitP =~
Sit2 (LP) 1.005 0.035 28.455 0.000 1.005 0.698
Sit4 (LP) 1.005 0.035 28.455 0.000 1.005 0.698
Sit6 (LP) 1.005 0.035 28.455 0.000 1.005 0.698
SitN =~
Sit1r (LN) 1.222 0.032 38.250 0.000 1.222 0.724
Sit3r (LN) 1.222 0.032 38.250 0.000 1.222 0.724
Sit5r (LN) 1.222 0.032 38.250 0.000 1.222 0.724
Covariances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
SitP ~~
SitN 0.596 0.041 14.422 0.000 0.596 0.596
Intercepts:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.Sit2 5.289 0.042 127.346 0.000 5.289 3.676
.Sit4 5.359 0.042 126.896 0.000 5.359 3.724
.Sit6 5.321 0.046 115.492 0.000 5.321 3.698
.Sit1r 4.547 0.053 86.474 0.000 4.547 2.695
.Sit3r 4.896 0.048 101.959 0.000 4.896 2.902
.Sit5r 4.860 0.052 94.060 0.000 4.860 2.881
SitP 0.000 0.000 0.000
SitN 0.000 0.000 0.000
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.Sit1r (EN) 1.353 0.060 22.722 0.000 1.353 0.475
.Sit2 (EP) 1.060 0.061 17.452 0.000 1.060 0.512
.Sit3r (EN) 1.353 0.060 22.722 0.000 1.353 0.475
.Sit4 (EP) 1.060 0.061 17.452 0.000 1.060 0.512
.Sit5r (EN) 1.353 0.060 22.722 0.000 1.353 0.475
.Sit6 (EP) 1.060 0.061 17.452 0.000 1.060 0.512
SitP 1.000 1.000 1.000
SitN 1.000 1.000 1.000
R-Square:
Estimate
Sit1r 0.525
Sit2 0.488
Sit3r 0.525
Sit4 0.488
Sit5r 0.525
Sit6 0.488
Defined Parameters:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
OmegaP 0.741 0.020 36.909 0.000 0.741 0.741
OmegaN 0.768 0.015 52.547 0.000 0.768 0.768
The values resulting from Omega are the Spearman-Brown reliablity estimates. Note that reliablity for parallel items < reliability for tau equivalent items < reliability for CFA, but that the estimates are very close. That is common when a model fits the data.
The positive factor scores have an estimated mean of 0 with a variance of 0.78 instead of 1.00 (due to the effect of the prior distribution – this is called “shrinkage”). The SE for each person’s factor score is .472. Treating factor scores as observed variables is like saying SE = 0.
95% confidence interval for positive factor score = Score ± 2*.472 = Score ± .944.
# calculate factor scores using function above:
fscores = factorScores(lavObject = model04Estimates)
#show variance of factor score:
var(fscores$scores$SitP)
[1] 0.7778887
# Histogram overlaid with kernel density curve
ggplot(fscores$scores, aes(x=SitP)) +
geom_histogram(aes(y=..density..), # Histogram with density instead of count on y-axis
binwidth=.5,
colour="black", fill="white") + xlim(c(-4,4)) + labs(title = "Positive Situation Forgiveness Factor Score") +
geom_density(alpha=.2, fill="#FF6666") # Overlay with transparent density plot
The negative factor scores have an estimated mean of 0 with a variance of 0.825 instead of 1.00. The SE for each person’s factor score is .418, so the 80% confidence interval is Score ± .836.
The negative factor scores retain more variance (and have a smaller SE) because there is more information in them, due to higher factor loadings (greater reliability) of their items.
#show variance of factor score:
var(fscores$scores$SitN)
[1] 0.8257565
# Histogram overlaid with kernel density curve
ggplot(fscores$scores, aes(x=SitN)) +
geom_histogram(aes(y=..density..), # Histogram with density instead of count on y-axis
binwidth=.5,
colour="black", fill="white") + xlim(c(-4,4)) + labs(title = "Negative Situation Forgiveness Factor Score") +
geom_density(alpha=.2, fill="#FF6666") # Overlay with transparent density plot
The parallel items model constrains the item factor loadings to be equal and unique variances to be equal for a given factor. When estimated separately (or when when estimated with a factor correlation), the corrlation of factor scores from the parallel items model and the sum scores from CTT is equal to one. What this means is that to be able to use sum scores in an analysis, the parallel items model must hold. To demonstrate, the lavaan
syntax below sets the factor covariance to zero (and therefore the correlation between factors to zero) by using 0* in the statement SitP ~~ 0*SitN
:
model04SyntaxSBzeroCor = "
# SitP loadings (all estimated)
SitP =~ LP*Sit2 + LP*Sit4 + LP*Sit6
# SitN loadings (all estimated)
SitN =~ LN*Sit1r + LN*Sit3r + LN*Sit5r
# Unique Variances:
Sit1r ~~ EN*Sit1r; Sit2 ~~ EP*Sit2; Sit3r ~~ EN*Sit3r; Sit4 ~~ EP*Sit4; Sit5r ~~ EN*Sit5r; Sit6 ~~ EP*Sit6;
SitP ~~ 0*SitN
# Calculate Omega Reliability for Sum Scores:
OmegaP := ((3*LP)^2) / ( ((3*LP)^2) + 3*EP)
OmegaN := ((3*LN)^2) / ( ((3*LN)^2) + 3*EN)
"
model04SyntaxSBzeroCorEstimates = sem(model = model04SyntaxSBzeroCor, data = hfsSituations, estimator = "MLR", mimic = "mplus", std.lv = TRUE)
sumscoresP = apply(X = hfsSituations[c("Sit2", "Sit4", "Sit6")], MARGIN = 1, FUN = sum)
sumscoresN = apply(X = hfsSituations[c("Sit1r", "Sit3r", "Sit5r")], MARGIN = 1, FUN = sum)
fscoresPI = factorScores(lavObject = model04SyntaxSBzeroCorEstimates)
par(mfrow = c(1,2))
plot(x = fscores$scores$SitP, y = sumscoresP, xlab = "SitP: Full CFA Factor Scores", ylab = "SitP: Sum Score", xlim = c(-3.5, 2))
text(x = -2, y = 20, labels = paste("r =", as.character(round(x = cor(fscores$scores$SitP, sumscoresP), digits = 3))))
plot(x = fscores$scores$SitN, y = sumscoresN, xlab = "SitN: Full CFA Factor Scores", ylab = "SitN: Sum Score", xlim = c(-3.5, 2))
text(x = -2, y = 20, labels = paste("r =", as.character(round(x = cor(fscores$scores$SitN, sumscoresN), digits = 3))))
plot(x = fscoresPI$scores$SitP, y = sumscoresP, xlab = "SitP: Full CFA Factor Scores", ylab = "SitP: Sum Score", xlim=c(-3.5, 2))
text(x = -2, y = 20, labels = paste("r =", as.character(round(x = cor(fscoresPI$scores$SitP, sumscoresP), digits = 3))))
plot(x = fscoresPI$scores$SitN, y = sumscoresN, xlab = "SitN: Full CFA Factor Scores", ylab = "SitN: Sum Score", xlim=c(-3.5, 2))
text(x = -2, y = 20, labels = paste("r =", as.character(round(x = cor(fscoresPI$scores$SitN, sumscoresN), digits = 3))))
par(mfrow = c(1,1))
We can also see how reasonable our choice of a continuous distribution was for these Likert data by plotting the predicted item scores for each item across a the range of observed factor scores. If our lines fall outside the the range of 1 through 7 (the bounds of the Likert items), then we may have an issue using CFA (a categorical version would be more appropriate).
lavObject = model04Estimates
cfaPlots = function(lavObject){
output = inspect(object = lavObject, what = "est")
#build matrix plot elements
#get factor scores
fscores = factorScores(lavObject = lavObject)
nfactors = ncol(fscores$scores)/2
#get max observed data
itemMax = max(apply(X = lavObject@Data@X[[1]], MARGIN = 2, FUN = max))
itemMin = min(apply(X = lavObject@Data@X[[1]], MARGIN = 2, FUN = min))
#get range for all scores
factorMax = max(apply(X = fscores$scores[seq(1, ncol(fscores$scores), 2)], MARGIN = 2, FUN = max))
factorMin = min(apply(X = fscores$scores[seq(1, ncol(fscores$scores), 2)], MARGIN = 2, FUN = min))
#set up x values
x = seq(factorMin, factorMax, .01)
par(mfrow = c(1, nfactors))
#make plots by factor
factor=1
for (factor in 1:nfactors){
xmat = NULL
ymat = NULL
inames = NULL
for (item in 1:nrow(output$lambda)){
if (output$lambda[item, factor] != 0){
inames = c(inames, rownames(output$lambda)[item])
y = output$nu[item] + output$lambda[item, factor]*x
xmat = cbind(xmat, x)
ymat = cbind(ymat, y)
}
}
matplot(x = xmat, y = ymat, type = "l", lwd = 5, lty=2:(ncol(xmat)+1), ylim = c(itemMin-1, itemMax+1), xlim = c(factorMin, factorMax),
ylab = "Predicted Item Response", xlab = colnames(output$lambda)[factor], col = 2:(ncol(xmat)+1))
lines(x = c(factorMin,factorMax), y = c(itemMin, itemMin), lty = 3, lwd = 5)
lines(x = c(factorMin,factorMax), y = c(itemMax, itemMax), lty = 3, lwd = 5)
legend(x = -3, y = 7, legend = inames, lty = 2:(ncol(xmat)+1), lwd = 5, col = 2:(ncol(xmat)+1))
}
par(mfrow = c(1,1))
}
par(mfrow = c(1,2))
cfaPlots(lavObject = model04Estimates)
What if we had just taken the sum (or, here, the mean) of the three items for each subscale?
PosMean = apply(X = hfsSituations[c("Sit2", "Sit4", "Sit6")], MARGIN = 1, FUN = mean)
NegMean = apply(X = hfsSituations[c("Sit1r", "Sit3r", "Sit5r")], MARGIN = 1, FUN = mean)
par(mfrow = c(1,2))
plot(y = PosMean, x = fscores$scores$SitP, xlim = c(-3.5,2), ylim = c(1,7), xlab = "SitP Factor Score", ylab = "Postive Items Mean")
text(x = -3, y = 6, labels = paste("r = ", as.character(round(cor(PosMean, fscores$scores$SitP), digits = 3))))
plot(y = NegMean, x = fscores$scores$SitN, xlim = c(-3.5,2), ylim = c(1,7), xlab = "SitN Factor Score", ylab = "Negative Items Mean")
text(x = -3, y = 6, labels = paste("r = ", as.character(round(cor(NegMean, fscores$scores$SitN), digits = 3))))
par(mfrow = c(1,1))
We can also plot how the two types of scores look together:
par(mfrow = c(1,2))
plot(x = PosMean, y = NegMean, xlim = c(1,8), ylim = c(1,8), xlab = "Postive Items Mean", ylab = "Negative Items Mean")
text(x = 1.85, y = 8, labels = paste("r = ", as.character(round(cor(PosMean, NegMean), digits = 3))))
plot(x = fscores$scores$SitP, y = fscores$scores$SitN, xlim = c(-3.5,2), ylim = c(-3.5,2), xlab = "SitP Factor Score", ylab = "SitN Factor Score")
text(x = -2.5, y = 1.9, labels = paste("r = ", as.character(round(cor(fscores$scores$SitP, fscores$scores$SitN), digits = 3))))
par(mfrow = c(1,1))
There are problems with either of these observed variable approaches: The mean of the items appears to have less variability (i.e., fewer possible scores) and assumes that all items should be weighted equally and have no error. The estimated factor scores do not have the same properties as estimated for the factor in the model (i.e., less variance for each factor, higher correlation among the factors). Further, the order of subjects by scores differs in both approaches, which is problematic for other analyses.
What to do instead of either of these? Stay tuned for how to use plausible values, Bayesian methods, or SEM.
R code to build objects reported in results (don’t show this typically)
#number of observations
format(x = model04Estimates@Data@nobs[[1]][1], big.mark = ",")
[1] "1,103"
#low and high standardized loadings for one factor model
stdLoadings = inspect(object = model01Estimates, what = "std.all")
stdLoadingsLow = round(min(apply(X = stdLoadings$lambda, MARGIN = 2, FUN = function(x) min(x[which(x !=0)]))), digits = 3)
stdLoadingsHigh = round(max(apply(X = stdLoadings$lambda, MARGIN = 2, FUN = function(x) max(x[which(x !=0)]))), digits = 3)
# estimated correlation between factors in two-factor model
stdCov04 = inspect(object = model04Estimates, what = "std.all")$psi
estCorr = format(x = stdCov04[2,1], digits = 3)
#standardized loadings positive
stdLoadings04P = inspect(object = model04Estimates, what = "std.all")$lambda[,1]
minSitP = round(min(stdLoadings04P[which(stdLoadings04P != 0)]), digits = 3)
maxSitP = round(max(stdLoadings04P[which(stdLoadings04P != 0)]), digits = 3)
#standardized loadings negative
stdLoadings04N = inspect(object = model04Estimates, what = "std.all")$lambda[,2]
minSitN = round(min(stdLoadings04N[which(stdLoadings04N != 0)]), digits = 3)
maxSitN = round(max(stdLoadings04N[which(stdLoadings04N != 0)]), digits = 3)
factorRel = factorScoreReliability(lavObject = model04Estimates)
(Note: You may borrow the phrasing contained in this example to describe various aspects of your analyses, but your own results sections will not mimic this example exactly—they should be customized to describe the how and the why of what you did, specifically).
(Descriptive information for the sample and items would have already been given in the method section…)
The reliability and dimensionality of six items each assessing forgiveness of situations was assessed in a sample of 1,103 persons with a confirmatory factor analysis using robust maximum likelihood estimation (MLR) in the lavaan package (Rosseel, 2012) in R (R Core Team, 2017). All models were identified by setting any latent factor means to 0 and latent factor variances to 1, such that all item intercepts, item factor loadings, and item residual variances were then estimated. The six items utilized a seven-point response scale, and three items were reverse-coded prior to analysis such that higher values then indicated greater levels of forgiveness of situations for all items. Model fit statistics reported in Table 1 include the obtained model \(\chi^2\), its scaling factor (in which values different than 1.000 indicate deviations from normality), its degrees of freedom, and its p-value (in which non-significance is desirable for good fit), CFI, or Comparative Fit Index (in which values higher than .95 are desirable for good fit), and the RMSEA, or Root Mean Square Error of Approximation, point estimate and 90% confidence interval (in which values lower than .06 are desirable for good fit). As reported in Table 2, nested model comparisons were conducted using the rescaled \(−2\Delta LL\) with degrees of freedom equal to the rescaled difference in the number of parameters between models (i.e., a rescaled likelihood ratio test). The specific models examined are described in detail below.
Although a one-factor model was initially posited to account for the pattern of covariance across these six items, it resulted in poor fit, as shown in Table 1. Although each item had a significant factor loading (with standardized loadings ranging from 0.509 to 0.778), a single latent factor did not adequately describe the pattern of relationship across these six items as initially hypothesized. Sources of local misfit were identified using the normalized residual covariance matrix, available via the resid() function in lavaan, in which individual values were calculated as: (observed covariance – expected covariance) / SD(observed covariance). Relatively large positive residual covariances were observed among items 2, 4, and 6 (the positively-worded items), indicating that these items were more related than was predicted by the single-factor model. Modification indices, available via the modificationindices() function in lavaan, corroborated this pattern, further suggesting additional remaining relationships among the negatively-worded items as well.
The necessity of separate latent factors for the positively-worded and negatively-worded items was tested by specifying a two-factor model in which the positively-worded items 2, 4, and 6 indicated a forgiveness factor, and in which negatively-worded items 1, 3, and 5 indicated a not unforgiveness factor, and in which the two factors were allowed to correlate. The two-factor model fit was acceptable by every criterion except the significant \(\chi^2\), likely due to the large sample. In addition, the two-factor model fit significantly better than the one-factor model, as reported in Table 2, indicating that the estimated correlation between the two factors of 0.564 was significantly less than 1.000. Thus, the six items appeared to measure two separate but related constructs. Further examination of local fit via normalized residual covariances and modification indices yielded no interpretable remaining relationships, and thus this two-factor model was retained.
Table 3 provides the estimates and their standard errors for the item factor loadings, intercepts, and residual variances from both the unstandardized and standardized solutions. All factor loadings and the factor covariance were statistically significant. As shown in Table 3, standardized loadings for the forgiveness factor items ranged from 0.625 to 0.759 (with \(R^2\) values for the amount of item variance accounted for by the factor ranging from 0.391 to 0.759), and standardized loadings for the not unforgiveness factor ranged from 0.588 to 0.846 (with R2 values of 0.346 to 0.716), suggesting the factor loadings were practically significant as well. Factor score reliaiblity was 1 for the forgiveness factor and 1 for the not unforgiveness factor, suggesting marginal reliability for both of the three-item scales.
The resulting distribution of the EAP estimates of factor score as shown in Figure 1. Figure 2 shows the predicted response for each item as a linear function of the latent factor based on the estimated model parameters. As shown, the predicted item response goes above the highest response option just before a latent factor score of 2 (i.e., 2 SDs above the mean), resulting in a ceiling effect for both sets of factor scores, as also shown in Figure 1. In addition, for the not unforgiveness factor, the predicted item response goes below the lowest response option just before a latent factor score of -3 (i.e., 3 SDs below the mean), resulting in a floor effect for the not unforgiveness factor, as also shown in Figure 1.
The extent to which the items within each factor could be seen as exchangeable was then examined via an additional set of nested model comparisons, as reported in Table 1 (for fit) and Table 2 (for comparisons of fit). First, the assumption of tau-equivalence (i.e., true-score equivalence, equal discrimination across items) was examined by constraining the factor loadings to be equal within a factor. For the not unforgiveness factor, the tau-equivalent model fit was acceptable but was significantly worse than the original two-factor model fit (i.e., in which all loadings were estimated freely). For the forgiveness factor, however, the tau-equivalent model fit was acceptable and was not significantly worse than the original two-factor model fit. Thus, the assumption of tau-equivalence held for the forgiveness factor items only. Finally, the assumption of parallel items (i.e., equal factor loadings and equal residual variances, or equal reliability across items) was examined for the forgiveness factor items only, and the resulting model fit was acceptable but was significantly worse than the tau-equivalent forgiveness factor model fit. Thus, the assumption of parallel items did not hold for the forgiveness factor items. In summary, while the not unforgiveness factor items were not exchangeable, the forgiveness factor items were exchangeable with respect to their factor loadings only (i.e., equal discrimination, but not equal residual variances or reliability).
R Core Team (2017). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.
Rosseel, Y (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2), 1-36. URL http://www.jstatsoft.org/v48/i02/.
citation()
To cite R in publications use:
R Core Team (2017). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna,
Austria. URL https://www.R-project.org/.
A BibTeX entry for LaTeX users is
@Manual{,
title = {R: A Language and Environment for Statistical Computing},
author = {{R Core Team}},
organization = {R Foundation for Statistical Computing},
address = {Vienna, Austria},
year = {2017},
url = {https://www.R-project.org/},
}
We have invested a lot of time and effort in creating R, please cite it when using it for data analysis. See also
‘citation("pkgname")’ for citing R packages.
citation("lavaan")
To cite lavaan in publications use:
Yves Rosseel (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2), 1-36. URL
http://www.jstatsoft.org/v48/i02/.
A BibTeX entry for LaTeX users is
@Article{,
title = {{lavaan}: An {R} Package for Structural Equation Modeling},
author = {Yves Rosseel},
journal = {Journal of Statistical Software},
year = {2012},
volume = {48},
number = {2},
pages = {1--36},
url = {http://www.jstatsoft.org/v48/i02/},
}
fitstats01 = fitmeasures(object = model01Estimates)
fitstats04 = fitmeasures(object = model04Estimates)
fitstats04A = fitmeasures(object = model04EstimatesAlpha)
table1 = cbind(ncol(model01Estimates@SampleStats@cov[[1]]), fitstats01[1], fitstats01[6], fitstats01[9], fitstats01[7], fitstats01[8], fitstats01[25],
fitstats01[48], fitstats01[49], fitstats01[50], fitstats01[51])
colnames(table1) = c("# Items", "# Parameters", "Scaled Chi-Square", "Chi-Square Scale Factor", "DF", "p-value", "CFI", "RMSEA", "RMSEA Lower",
"RMSEA Upper", "RMSEA p-value")
rownames(table1) = "One-Factor"
table1 = rbind(table1, cbind(ncol(model04Estimates@SampleStats@cov[[1]]), fitstats04[1], fitstats04[6], fitstats04[9], fitstats04[7], fitstats04[8],
fitstats04[25], fitstats04[48], fitstats04[49], fitstats04[50], fitstats04[51]))
rownames(table1)[2] = "Two-Factor"
table1 = rbind(table1, cbind(ncol(model04Estimates@SampleStats@cov[[1]]), fitstats04A[1], fitstats04A[6], fitstats04A[9], fitstats04A[7], fitstats04A[8],
fitstats04A[25], fitstats04A[48], fitstats04A[49], fitstats04A[50], fitstats04A[51]))
rownames(table1)[3] = "Two-Factor Tau-Equivalent"
kable(table1, format = "html", digits = 3) %>% kable_styling("striped")
# Items | # Parameters | Scaled Chi-Square | Chi-Square Scale Factor | DF | p-value | CFI | RMSEA | RMSEA Lower | RMSEA Upper | RMSEA p-value | |
---|---|---|---|---|---|---|---|---|---|---|---|
One-Factor | 6 | 18 | 307.804 | 1.390 | 9 | 0.000 | 0.732 | 0.173 | 0.160 | 0.188 | 0.000 |
Two-Factor | 6 | 19 | 24.924 | 1.421 | 8 | 0.002 | 0.985 | 0.044 | 0.028 | 0.061 | 0.708 |
Two-Factor Tau-Equivalent | 6 | 15 | 55.487 | 1.325 | 12 | 0.000 | 0.961 | 0.057 | 0.044 | 0.071 | 0.167 |
table2 = anova(modelSaturatedEstimates, model01Estimates)[2,-c(2:4)]
rownames(table2) = "One-Factor vs. Saturated"
table2 = rbind(table2,anova(model01Estimates, model04Estimates)[2,-c(2:4)])
rownames(table2)[2] = "One-Factor vs. Two-Factor"
table2 = rbind(table2,anova(modelSaturatedEstimates, model04Estimates)[2,-c(2:4)])
rownames(table2)[3] = "Two-Factor vs. Saturated"
table2 = rbind(table2,anova(model04EstimatesAlpha, model04Estimates)[2,-c(2:4)])
rownames(table2)[4] = "Two-Factor vs. Two-Factor Tau-Equivalent"
kable(table2, format = "html", digits = 3) %>% kable_styling("striped")
Df | Chisq diff | Df diff | Pr(>Chisq) | |
---|---|---|---|---|
One-Factor vs. Saturated | 9 | 307.804 | 9 | 0.000 |
One-Factor vs. Two-Factor | 9 | 342.292 | 1 | 0.000 |
Two-Factor vs. Saturated | 8 | 24.924 | 8 | 0.002 |
Two-Factor vs. Two-Factor Tau-Equivalent | 12 | 33.635 | 4 | 0.000 |
unstandardizedLoadings = inspect(object = model04Estimates, what = "est")
unstandardizedSE = inspect(object = model04Estimates, what = "se")
standardizedLoadings = inspect(object = model04Estimates, what = "std.all")
table3 = cbind(unstandardizedLoadings$lambda[1:3,1], unstandardizedSE$lambda[1:3,1], standardizedLoadings$lambda[1:3,1])
colnames(table3) = c("Estimate", "SE", "Estimate")
rownames(table3) = c("Item 2", "Item 4", "Item 6")
table3 = rbind(table3, cbind(cbind(unstandardizedLoadings$lambda[4:6,2]), cbind(unstandardizedSE$lambda[4:6,2]), cbind(standardizedLoadings$lambda[4:6,2])))
rownames(table3)[4:6] = c("Item 1", "Item 3", "Item 5")
table3 = rbind(table3, c(unstandardizedLoadings$psi[2,1], unstandardizedSE$psi[2,1], standardizedLoadings$psi[2,1]))
rownames(table3)[7] = "Factor Covariance"
interceptsOrder = c(4,1,5,2,6,3)
table3 = rbind(table3, cbind(cbind(unstandardizedLoadings$nu[1:6,1][interceptsOrder]), cbind(unstandardizedSE$nu[1:6,1][interceptsOrder]),
cbind(standardizedLoadings$nu[1:6,1][interceptsOrder])))
rownames(table3)[8:13] = paste("Item", 1:6)
table3 = rbind(table3, cbind(cbind(diag(unstandardizedLoadings$theta)[interceptsOrder]), cbind(diag(unstandardizedSE$theta)[interceptsOrder]),
cbind(diag(standardizedLoadings$theta)[interceptsOrder])))
rownames(table3)[14:19] = paste("Item", 1:6)
kable(table3, format = "html", digits = 3) %>% kable_styling("striped") %>% add_header_above(c(" " = 1, "Unstandardized" = 2, "Standardized" = 1)) %>% group_rows("Forgiveness Factor Loadings", 1,3) %>% group_rows("Not Unforgiveness Factor Loadings", 4,6) %>%
group_rows("Factor Covariance", 7,7) %>% group_rows("Item Intercepts", 8,13) %>% group_rows("Item Unique Variances", 14, 19)
Estimate | SE | Estimate | |
---|---|---|---|
Forgiveness Factor Loadings | |||
Item 2 | 1.007 | 0.052 | 0.730 |
Item 4 | 1.064 | 0.050 | 0.759 |
Item 6 | 0.956 | 0.053 | 0.625 |
Not Unforgiveness Factor Loadings | |||
Item 1 | 1.325 | 0.048 | 0.759 |
Item 3 | 1.349 | 0.044 | 0.846 |
Item 5 | 1.009 | 0.055 | 0.588 |
Factor Covariance | |||
Factor Covariance | 0.564 | 0.041 | 0.564 |
Item Intercepts | |||
Item 1 | 4.547 | 0.053 | 2.604 |
Item 2 | 5.289 | 0.042 | 3.834 |
Item 3 | 4.896 | 0.048 | 3.070 |
Item 4 | 5.359 | 0.042 | 3.821 |
Item 5 | 4.860 | 0.052 | 2.832 |
Item 6 | 5.321 | 0.046 | 3.477 |
Item Unique Variances | |||
Item 1 | 1.294 | 0.103 | 0.425 |
Item 2 | 0.888 | 0.097 | 0.467 |
Item 3 | 0.724 | 0.092 | 0.285 |
Item 4 | 0.835 | 0.093 | 0.425 |
Item 5 | 1.926 | 0.119 | 0.654 |
Item 6 | 1.428 | 0.134 | 0.610 |
modelSaturated = "
# Item intercepts --> ~ 1 indicates means or intercepts
# You can put multiple lines of syntax on a single line using a ;
Sit1r ~ 1; Sit2 ~ 1; Sit3r ~ 1; Sit4 ~ 1; Sit5r ~ 1; Sit6 ~ 1;
# Item variances and covariances --> use the ~~ command
# Variances:
Sit1r ~~ Sit1r; Sit2 ~~ Sit2; Sit3r ~~ Sit3r; Sit4 ~~ Sit4; Sit5r ~~ Sit5r; Sit6 ~~ Sit6;
# Covariances:
Sit1r ~~ Sit2; Sit1r ~~ Sit3r; Sit1r ~~ Sit4; Sit1r ~~ Sit5r; Sit1r ~~ Sit6;
Sit2 ~~ Sit3r; Sit2 ~~ Sit4; Sit2 ~~ Sit5r; Sit2 ~~ Sit6;
Sit3r ~~ Sit4; Sit3r ~~ Sit5r; Sit3r ~~ Sit6;
Sit4 ~~ Sit5r; Sit4 ~~ Sit6;
Sit5r ~~ Sit6;
"
modelSaturatedEstimates = sem(model = modelSaturated, data = hfsSituations, estimator = "MLR", mimic = "mplus")
summary(modelSaturatedEstimates, fit.measures = TRUE)
lavaan (0.5-23.1097) converged normally after 46 iterations
Number of observations 1103
Number of missing patterns 1
Estimator ML Robust
Minimum Function Test Statistic 0.000 0.000
Degrees of freedom 0 0
Minimum Function Value 0.0000000000000
Scaling correction factor NA
for the Yuan-Bentler correction (Mplus variant)
Model test baseline model:
Minimum Function Test Statistic 1981.034 1128.693
Degrees of freedom 15 15
P-value 0.000 0.000
User model versus baseline model:
Comparative Fit Index (CFI) 1.000 1.000
Tucker-Lewis Index (TLI) 1.000 1.000
Robust Comparative Fit Index (CFI) NA
Robust Tucker-Lewis Index (TLI) NA
Loglikelihood and Information Criteria:
Loglikelihood user model (H0) -11322.435 -11322.435
Loglikelihood unrestricted model (H1) -11322.435 -11322.435
Number of free parameters 27 27
Akaike (AIC) 22698.870 22698.870
Bayesian (BIC) 22834.027 22834.027
Sample-size adjusted Bayesian (BIC) 22748.268 22748.268
Root Mean Square Error of Approximation:
RMSEA 0.000 0.000
90 Percent Confidence Interval 0.000 0.000 0.000 0.000
P-value RMSEA <= 0.05 NA NA
Robust RMSEA 0.000
90 Percent Confidence Interval 0.000 0.000
Standardized Root Mean Square Residual:
SRMR 0.000 0.000
Parameter Estimates:
Information Observed
Standard Errors Robust.huber.white
Covariances:
Estimate Std.Err z-value P(>|z|)
Sit1r ~~
Sit2 0.577 0.083 6.974 0.000
Sit3r 1.802 0.091 19.800 0.000
Sit4 0.734 0.080 9.170 0.000
Sit5r 1.358 0.100 13.638 0.000
Sit6 0.795 0.092 8.620 0.000
Sit2 ~~
Sit3r 0.697 0.077 9.023 0.000
Sit4 1.103 0.084 13.112 0.000
Sit5r 0.604 0.080 7.578 0.000
Sit6 0.965 0.083 11.687 0.000
Sit3r ~~
Sit4 0.824 0.076 10.848 0.000
Sit5r 1.319 0.091 14.495 0.000
Sit6 0.868 0.085 10.261 0.000
Sit4 ~~
Sit5r 0.695 0.079 8.843 0.000
Sit6 0.962 0.081 11.821 0.000
Sit5r ~~
Sit6 0.798 0.089 8.955 0.000
Intercepts:
Estimate Std.Err z-value P(>|z|)
Sit1r 4.547 0.053 86.474 0.000
Sit2 5.289 0.042 127.346 0.000
Sit3r 4.896 0.048 101.959 0.000
Sit4 5.359 0.042 126.896 0.000
Sit5r 4.860 0.052 94.060 0.000
Sit6 5.321 0.046 115.492 0.000
Variances:
Estimate Std.Err z-value P(>|z|)
Sit1r 3.049 0.098 31.216 0.000
Sit2 1.903 0.093 20.547 0.000
Sit3r 2.543 0.092 27.763 0.000
Sit4 1.967 0.095 20.683 0.000
Sit5r 2.945 0.100 29.595 0.000
Sit6 2.341 0.108 21.771 0.000
par(mfrow = c(1,2))
cfaPlots(lavObject = model04Estimates)
par(mfrow = c(1,1))