The Forgiveness of Situations Subscale includes six items, three of which are reverse-coded, on a seven-point scale:

- When things go wrong for reasons that can’t be controlled, I get stuck in negative thoughts about it. (R)
- With time I can be understanding of bad circumstances in my life.
- If I am disappointed by uncontrollable circumstances in my life, I continue to think negatively about them. (R)
- I eventually make peace with bad situations in my life.
- It’s really hard for me to accept negative situations that aren’t anybody’s fault. (R)
- Eventually I let go of negative thoughts about bad circumstances that are beyond anyone’s control.

Response Anchors:

- 1 = Almost Always False of Me
- 2 = ?
- 3 = More Often False of Me
- 4 = ?
- 5 = More Often True of Me
- 6 = ?
- 7 = Almost Always True of Me

*Note: the data for this example are simulated based on the results of the real data analysis. These results will differ from those reported in the other example file but are used to show you how to execute the syntax in this analysis.*

For this example, we will be using the `ggplot2` (for general plotting), `reshape2` (for data reshaping before plotting), and `lavaan` (for CFA) packages.

```
if (require(ggplot2) == FALSE){
  install.packages("ggplot2")
}
library(ggplot2)
if (require(reshape2) == FALSE){
  install.packages("reshape2")
}
library(reshape2)
if (require(lavaan) == FALSE){
  install.packages("lavaan")
}
library(lavaan)
if (require(psych) == FALSE){
  install.packages("psych")
}
library(psych)
if (require(knitr) == FALSE){
  install.packages("knitr")
}
library(knitr)
if (require(kableExtra) == FALSE){
  install.packages("kableExtra")
}
library(kableExtra)
if (require(semPlot) == FALSE){
  install.packages("semPlot")
}
library(semPlot)
```

The data are in a text file named `Study2.dat`, originally used in Mplus (so no column names were included at the top of the file). The file contains more items than we will use, so we select only the “Forgiveness of Situations” items from the whole file.

```
#syntax from previous example: commented to remove for simulation
# #read in data file (Mplus file format so having to label columns)
# hfsData = read.table(file = "Study2.dat", header = FALSE, na.strings = "99999", col.names = c("PersonID", "Self1", "Self2r", "Self3", "Self4r", "Self5", "Self6r", "Other1r", "Other2", "Other3r", "Other4", "Other5r", "Other6", "Sit1r", "Sit2", "Sit3r", "Sit4", "Sit5r", "Sit6", "Selfsub", "Othsub", "Sitsub", "HFSsum"))
#
# #select Situations items and PersonID variables
# hfsSituations = hfsData[c("PersonID", "Sit1r", "Sit2", "Sit3r", "Sit4", "Sit5r", "Sit6")]
#setting seed for output consistency across machines
set.seed(04092017)
# simulating data for analysis based on results of model04estimates
model04Simulate = "
# SitP loadings (all estimated)
SitP =~ 1.007*Sit2 + 1.064*Sit4 + 0.956*Sit6
# SitN loadings (all estimated)
SitN =~ 1.325*Sit1r + 1.349*Sit3r + 1.009*Sit5r
# Unique Variances:
Sit1r ~~ 1.294*Sit1r; Sit2 ~~ 0.888*Sit2; Sit3r ~~ 0.724*Sit3r; Sit4 ~~ 0.835*Sit4; Sit5r ~~ 1.926*Sit5r; Sit6 ~~ 1.428*Sit6;
# Item Intercepts:
Sit2 ~ 5.289*1
Sit4 ~ 5.359*1
Sit6 ~ 5.321*1
Sit1r ~ 4.547*1
Sit3r ~ 4.896*1
Sit5r ~ 4.860*1
# Factor Covariances
SitP ~~ .564*SitN
"
#generate data
hfsSituations = simulateData(model = model04Simulate, sample.nobs = 1103L, model.type = "sem")
#add ID variable
hfsSituations = data.frame(PersonID = 1:1103, hfsSituations)
#reorder variables to match data file
hfsSituations = hfsSituations[c("PersonID", "Sit1r", "Sit2", "Sit3r", "Sit4", "Sit5r", "Sit6")]
```

The observed correlation matrix, rounded to three digits:

```
#here the c() function selects only the item variables, not the PersonID variable
round(cor(hfsSituations[c("Sit1r", "Sit2", "Sit3r", "Sit4", "Sit5r", "Sit6")]), digits = 3)
```

```
Sit1r Sit2 Sit3r Sit4 Sit5r Sit6
Sit1r 1.000 0.360 0.658 0.348 0.450 0.284
Sit2 0.360 1.000 0.412 0.588 0.257 0.464
Sit3r 0.658 0.412 1.000 0.400 0.497 0.315
Sit4 0.348 0.588 0.400 1.000 0.277 0.440
Sit5r 0.450 0.257 0.497 0.277 1.000 0.193
Sit6 0.284 0.464 0.315 0.440 0.193 1.000
```

The observed means, rounded to three digits:

`apply(X = hfsSituations[c("Sit1r", "Sit2", "Sit3r", "Sit4", "Sit5r", "Sit6")], MARGIN = 2, FUN = function(x) round(mean(x), digits = 3))`

```
Sit1r Sit2 Sit3r Sit4 Sit5r Sit6
4.443 5.287 4.778 5.314 4.761 5.263
```

The observed variances (using \(N\) in the denominator to match ML estimated output and Mplus example), rounded to three digits:

`apply(X = hfsSituations[c("Sit1r", "Sit2", "Sit3r", "Sit4", "Sit5r", "Sit6")], MARGIN = 2, FUN = function(x) round(var(x, na.rm = TRUE)*1102/1103, digits = 3))`

```
Sit1r Sit2 Sit3r Sit4 Sit5r Sit6
3.209 2.007 2.616 1.976 2.979 2.281
```
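The \((N-1)/N\) rescaling used above can be checked on a tiny, made-up vector; the data below are hypothetical and purely for illustration:

```r
# Hypothetical data purely to illustrate the N vs. N-1 denominators
x <- c(2, 4, 6, 8)
n <- length(x)
var(x)                  # R's var() uses N-1 in the denominator: 6.667
var(x) * (n - 1) / n    # ML-style variance with N in the denominator: 5
```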

To conduct a CFA, you technically need only the item means, variances, and either the correlations or covariances among items. That said, modern estimation methods use the raw data (often called full information) rather than the summary statistics, because the raw data permit better missing-data assumptions under maximum likelihood and Bayesian estimation.

The sample covariance matrix can be found from the sample correlations and variances. Each covariance between a pair of variables \(y_1\) and \(y_2\) is denoted with a \(\sigma_{y_1, y_2}\) and each correlation is denoted with a \(\rho_{y_1, y_2}\). The variance of a variable is denoted by \(\sigma^2_{y_1}\) and the standard deviation of a variable is the square root of the variance \(\sqrt{\sigma^2_{y_1}}\). The covariance can be found by taking the correlation and multiplying it by the product of the standard deviations.

\[\sigma_{y_1, y_2} = \rho_{y_1, y_2}\sqrt{\sigma^2_{y_1}}\sqrt{\sigma^2_{y_2}}. \]

Inversely, the correlation can be found by taking the covariance and dividing it by the product of the standard deviations:

\[\rho_{y_1, y_2} = \frac{\sigma_{y_1, y_2}}{\sqrt{\sigma^2_{y_1}}\sqrt{\sigma^2_{y_2}}}. \] Again, we change the denominator from \(N-1\) to \(N\) to be consistent with the Mplus example, which calculates covariances using maximum likelihood.
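As a quick check of these formulas, we can recover the Sit1r-Sit2 covariance from the correlation (0.360) and the two observed variances (3.209 and 2.007) reported above:

```r
# Reproduce the Sit1r-Sit2 covariance from the reported correlation
# and the observed (N-denominator) variances
r12   <- 0.360   # correlation between Sit1r and Sit2
var1  <- 3.209   # variance of Sit1r
var2  <- 2.007   # variance of Sit2
cov12 <- r12 * sqrt(var1) * sqrt(var2)
round(cov12, 3)  # 0.914, matching the covariance matrix
```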

`round(cov(hfsSituations[c("Sit1r", "Sit2", "Sit3r", "Sit4", "Sit5r", "Sit6")])*1102/1103, digits = 3)`

```
Sit1r Sit2 Sit3r Sit4 Sit5r Sit6
Sit1r 3.209 0.914 1.905 0.876 1.391 0.768
Sit2 0.914 2.007 0.945 1.170 0.629 0.994
Sit3r 1.905 0.945 2.616 0.909 1.386 0.769
Sit4 0.876 1.170 0.909 1.976 0.673 0.934
Sit5r 1.391 0.629 1.386 0.673 2.979 0.502
Sit6 0.768 0.994 0.769 0.934 0.502 2.281
```

The assumptions of CFA (i.e., normally distributed factors, no item-by-factor interactions, and conditionally normally distributed item responses) together imply that our item responses should be normally distributed. From the density plots below, do you think these are normally distributed?

```
#stack data
melted = melt(hfsSituations, id.vars = "PersonID")
#plot by variable
ggplot(melted, aes(value)) + geom_density() + facet_wrap(~ variable)
```

`lavaan` syntax is constructed as a long text string saved in a character object. In the example below, `model01SyntaxLong = "` opens the text string, which continues for multiple lines until the final `"` terminates it. The object `model01SyntaxLong` now contains the text of the `lavaan` syntax. Within the syntax string, the R comment character `#` still functions to comment the surrounding syntax. Each part of the model syntax corresponds to a set of parameters in the CFA model. You can find more information about lavaan at http://lavaan.ugent.be.

```
model01SyntaxLong = "
# Model 1 -- Fully Z-Scored Factor Identification Approach
# Item factor loadings --> list the factor to the left of the =~ and the items to the right, separated by a plus
# Once the factor is defined it can be used in syntax as if it is an observed variable
# Parameters can be fixed to constant values by using the * command; starting values can be specified by using start()*; labels can be implemented with []
Sit =~ Sit1r + Sit2 + Sit3r + Sit4 + Sit5r + Sit6
# Item intercepts --> ~ 1 indicates means or intercepts
# You can put multiple lines of syntax on a single line using a ;
Sit1r ~ 1; Sit2 ~ 1; Sit3r ~ 1; Sit4 ~ 1; Sit5r ~ 1; Sit6 ~ 1;
# Item error (unique) variances and covariances --> use the ~~ command
Sit1r ~~ Sit1r; Sit2 ~~ Sit2; Sit3r ~~ Sit3r; Sit4 ~~ Sit4; Sit5r ~~ Sit5r; Sit6 ~~ Sit6;
# Factor variance
Sit ~~ 100*Sit
# Factor mean (intercept)
Sit ~ 0
"
```

To run lavaan, the syntax string variable is passed to the `lavaan(model = model01SyntaxLong, ...)` function. In the function call, we also supply:

- `data = hfsSituations`: The data frame containing the variables named in the syntax.
- `estimator = "MLR"`: Enables robust maximum likelihood estimation, which helps for data that are not entirely normal.
- `mimic = "mplus"`: Ensures compatibility with Mplus output, which is often needed in practice (and is certainly needed for our homework system).
- `std.lv = TRUE`: Uses the Z-score method of identification, setting all latent variable means to zero, all latent variable variances to one, and estimating all factor loadings. (The call below passes `std.lv = FALSE`; because the long syntax already fixes the factor mean and variance explicitly, those user-specified constraints identify the model here.)

`model01Estimates = lavaan(model = model01SyntaxLong, data = hfsSituations, estimator = "MLR", mimic = "mplus", std.lv = FALSE)`

If the model estimation was successful, no errors will be displayed (and no output will be shown). All model information and results are contained within the `model01Estimates` object. To access a summary of the results, use the `summary()` function with the following arguments:

- `model01Estimates`: The analysis to be summarized.
- `fit.measures = TRUE`: Provides model fit indices in the summary.
- `rsquare = TRUE`: Provides the proportion of variance “explained” for each variable (\(R^2\)).
- `standardized = TRUE`: Provides standardized estimates in the summary.

`summary(model01Estimates, fit.measures = TRUE, rsquare = TRUE, standardized = TRUE)`

```
lavaan (0.5-23.1097) converged normally after 39 iterations
Number of observations 1103
Number of missing patterns 1
Estimator ML Robust
Minimum Function Test Statistic 387.333 467.448
Degrees of freedom 9 9
P-value (Chi-square) 0.000 0.000
Scaling correction factor 0.829
for the Yuan-Bentler correction (Mplus variant)
Model test baseline model:
Minimum Function Test Statistic 2076.990 2090.854
Degrees of freedom 15 15
P-value 0.000 0.000
User model versus baseline model:
Comparative Fit Index (CFI) 0.817 0.779
Tucker-Lewis Index (TLI) 0.694 0.632
Robust Comparative Fit Index (CFI) 0.816
Robust Tucker-Lewis Index (TLI) 0.693
Loglikelihood and Information Criteria:
Loglikelihood user model (H0) -11535.690 -11535.690
Scaling correction factor 1.076
for the MLR correction
Loglikelihood unrestricted model (H1) -11342.023 -11342.023
Scaling correction factor 0.994
for the MLR correction
Number of free parameters 18 18
Akaike (AIC) 23107.380 23107.380
Bayesian (BIC) 23197.484 23197.484
Sample-size adjusted Bayesian (BIC) 23140.311 23140.311
Root Mean Square Error of Approximation:
RMSEA 0.195 0.215
90 Percent Confidence Interval 0.179 0.212 0.197 0.233
P-value RMSEA <= 0.05 0.000 0.000
Robust RMSEA 0.196
90 Percent Confidence Interval 0.181 0.211
Standardized Root Mean Square Residual:
SRMR 0.079 0.079
Parameter Estimates:
Information Observed
Standard Errors Robust.huber.white
Latent Variables:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
Sit =~
Sit1r 0.132 0.006 21.977 0.000 1.317 0.735
Sit2 0.084 0.006 14.718 0.000 0.844 0.596
Sit3r 0.129 0.006 23.254 0.000 1.287 0.796
Sit4 0.082 0.006 14.340 0.000 0.824 0.586
Sit5r 0.097 0.006 17.607 0.000 0.970 0.562
Sit6 0.072 0.006 12.137 0.000 0.720 0.477
Intercepts:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.Sit1r 4.443 0.054 82.373 0.000 4.443 2.480
.Sit2 5.287 0.043 123.919 0.000 5.287 3.731
.Sit3r 4.778 0.049 98.116 0.000 4.778 2.954
.Sit4 5.314 0.042 125.550 0.000 5.314 3.780
.Sit5r 4.761 0.052 91.624 0.000 4.761 2.759
.Sit6 5.263 0.045 115.739 0.000 5.263 3.485
Sit 0.000 0.000 0.000
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.Sit1r 1.475 0.108 13.649 0.000 1.475 0.460
.Sit2 1.294 0.082 15.747 0.000 1.294 0.645
.Sit3r 0.959 0.093 10.279 0.000 0.959 0.367
.Sit4 1.298 0.081 16.072 0.000 1.298 0.657
.Sit5r 2.037 0.108 18.787 0.000 2.037 0.684
.Sit6 1.762 0.088 20.034 0.000 1.762 0.772
Sit 100.000 1.000 1.000
R-Square:
Estimate
Sit1r 0.540
Sit2 0.355
Sit3r 0.633
Sit4 0.343
Sit5r 0.316
Sit6 0.228
```

Each section contains different portions of model information.

```
Latent Variables:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
Sit =~
Sit1r 1.234 0.069 17.906 0.000 1.234 0.707
Sit2 0.702 0.074 9.441 0.000 0.702 0.509
Sit3r 1.241 0.063 19.847 0.000 1.241 0.778
Sit4 0.784 0.069 11.333 0.000 0.784 0.559
Sit5r 1.023 0.053 19.179 0.000 1.023 0.596
Sit6 0.819 0.069 11.942 0.000 0.819 0.535
```

Note: in the Intercepts section below, the last term in the list, `Sit`, is the “intercept” (mean) of the factor. As it does not have a standard error, this indicates it was fixed to zero and not estimated.

```
Intercepts:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.Sit1r 4.547 0.053 86.474 0.000 4.547 2.604
.Sit2 5.289 0.042 127.346 0.000 5.289 3.834
.Sit3r 4.896 0.048 101.959 0.000 4.896 3.070
.Sit4 5.359 0.042 126.896 0.000 5.359 3.821
.Sit5r 4.860 0.052 94.060 0.000 4.860 2.832
.Sit6 5.321 0.046 115.492 0.000 5.321 3.477
Sit 0.000 0.000 0.000
```

Note: in the Variances section below, the last term in the list, `Sit`, is the variance of the factor. As it does not have a standard error, this indicates it was fixed to one (from the `std.lv = TRUE` option used in the `lavaan()` function call) and not estimated.

```
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.Sit1r 1.526 0.149 10.217 0.000 1.526 0.500
.Sit2 1.409 0.128 11.014 0.000 1.409 0.741
.Sit3r 1.004 0.135 7.456 0.000 1.004 0.395
.Sit4 1.352 0.127 10.672 0.000 1.352 0.687
.Sit5r 1.899 0.118 16.025 0.000 1.899 0.645
.Sit6 1.671 0.159 10.517 0.000 1.671 0.714
Sit 1.000 1.000 1.000
```

*Writing out the model—individual predicted values:*

- \(Y_1 = \mu_1 + \lambda_1F + e_1 = 4.547 + 1.234F + e_1\)

*Writing out the model—predicted item variances and covariances:*

\(Var\left( Y_1 \right) = \left( \lambda^2_1 \right) Var\left(F\right) + Var\left(e_1\right) = (1.234^2)*(1) + 1.526 = 3.049\) (= original item variance)

\(Cov\left( Y_1, Y_2 \right) = \lambda_1 Var\left(F\right) \lambda_2 = (1.234)*(1)*(.702) = .866\) (actual covariance = .577, so the model over-predicted how related items 1 and 2 should be)
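These two hand calculations can be reproduced with a little matrix algebra in base R. This is only a sketch using the estimates quoted above for the first two items:

```r
# Model-implied covariance matrix for Sit1r and Sit2:
# Sigma = Lambda * Phi * Lambda' + Theta, using the quoted estimates
lambda <- c(1.234, 0.702)   # factor loadings for Sit1r, Sit2
theta  <- c(1.526, 1.409)   # unique (error) variances
phi    <- 1                 # factor variance (Z-score identification)

Sigma <- outer(lambda, lambda) * phi + diag(theta)
round(Sigma[1, 1], 3)  # 3.049, predicted variance of Sit1r
round(Sigma[1, 2], 3)  # 0.866, predicted covariance of Sit1r with Sit2
```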

The R package `semPlot` can draw a path diagram of a fitted model. Here is an example:

`semPaths(object = model01Estimates, what = "est")`

The last two columns of the summary output, `Std.lv` and `Std.all`, contain the standardized model parameter estimates. To understand the difference between standardized and unstandardized parameter estimates, let’s start with the unstandardized estimates. Unstandardized regression coefficients (and factor loadings) have a scale of “units of \(Y\)” per “unit of \(X\)”. That is, the slope/loading represents the increase in the dependent variable \(Y\) per one-unit increase in the independent variable \(X\) (in our case, the factor). The units of \(Y\) are given by the standard deviation of \(Y\), or \(SD(Y)\). Similarly, the units of \(X\) are given by the standard deviation of \(X\), or \(SD(X)\). You can think of the units as the fraction \(\frac{SD(Y)}{SD(X)}\). So, the first factor loading (a value of 1.234 for the item Sit1r) indicates the numeric response to the item goes up by 1.234 for every one-unit increase in the factor Sit.

The process of standardization removes the units attached to the parameter. If the unstandardized factor loadings are expressed in units of \(\frac{SD(Y)}{SD(X)}\), the standardized versions are obtained by multiplying or dividing by the appropriate standard deviation to make the numerator or denominator of \(\frac{SD(Y)}{SD(X)}\) equal to one. For the standardized estimates under `Std.lv`, the units of the factor \((SD(X))\) are removed (yielding \(\frac{SD(Y)}{1}\)) by multiplying the estimate by the factor standard deviation. Because the factor standard deviation is set to one here, all estimates in this column are the same as the unstandardized estimates. These estimates can be used to see how parameters would look under the Z-score identification method if another identification method had been used.

The standardized estimates listed under `Std.all` are formed by multiplying the estimate by the standard deviation of the factor and dividing by the unconditional (raw) standard deviation of the item. For instance, the “fully” standardized factor loading of the first item is found by multiplying the unstandardized coefficient (1.234) by one (the factor standard deviation) and dividing by the item’s standard deviation (\(\sqrt{3.049}\), the square root of the item variance computed earlier in this example). The resulting value, \(1.234\times\frac{1}{\sqrt{3.049}} = 0.707\), represents the factor loading that would have been obtained had the analysis been run (a) with Z-score factor identification and (b) on items that had themselves been Z-scored.

The process of standardization is the same for all parameters of the model: intercepts, loadings, and residual (unique) variances. The interpretation of standardized item intercepts is difficult and often these are not reported. The standardized versions of factor loadings and unique variances are commonly reported.

Moreover, for items measuring one factor, the standardized loadings lead directly to the \(R^2\) estimate – the amount of variance in the item responses explained by the factor. The item \(R^2\) is the square of the fully standardized factor loading. The R-Square information is found at the end of the model summary, so long as the option `rsquare = TRUE` is specified in the `summary()` function call.

```
R-Square:
Estimate
Sit1r 0.500
Sit2 0.259
Sit3r 0.605
Sit4 0.313
Sit5r 0.355
Sit6 0.286
```
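As a quick check, squaring the standardized loading for Sit1r reproduces its \(R^2\) entry:

```r
# Item R-squared as the square of the fully standardized loading (Sit1r)
std_loading <- 0.707
rsq <- std_loading^2
round(rsq, 3)   # 0.500
```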

The syntax above is designed to show the mapping of all CFA parameters to all lavaan syntax elements. In reality, all you would need to write to define this model is:

```
model01SyntaxShort = "
Sit =~ Sit1r + Sit2 + Sit3r + Sit4 + Sit5r + Sit6
"
```

The syntax input into the `sem()` function uses some defaults:

- All item intercepts are estimated (each item gets its own) and the factor mean is fixed at zero.
- All item error (unique) variances are estimated (each item gets its own).
- All factor variances and covariances are estimated (by default, the first item after the `=~` has its factor loading set to one – the marker item identification method; here `std.lv = TRUE` overrides this so all loadings are estimated).

To run this syntax, the `sem()` function is now called; to see the results, the `summary()` function is used. The `sem()` function simplifies the syntax needed to conduct a confirmatory factor analysis. Both are demonstrated below. Note the model fit statistics are identical to the previous run; only the scaling of the unstandardized estimates differs because of the different identification constraints (the `Std.lv` and `Std.all` columns match).

```
model01EstimatesShort = sem(model = model01SyntaxShort, data = hfsSituations, estimator = "MLR", mimic = "mplus", std.lv = TRUE)
summary(model01EstimatesShort, fit.measures = TRUE, rsquare = TRUE, standardized = TRUE)
```

```
lavaan (0.5-23.1097) converged normally after 20 iterations
Number of observations 1103
Number of missing patterns 1
Estimator ML Robust
Minimum Function Test Statistic 387.333 467.448
Degrees of freedom 9 9
P-value (Chi-square) 0.000 0.000
Scaling correction factor 0.829
for the Yuan-Bentler correction (Mplus variant)
Model test baseline model:
Minimum Function Test Statistic 2076.990 2090.854
Degrees of freedom 15 15
P-value 0.000 0.000
User model versus baseline model:
Comparative Fit Index (CFI) 0.817 0.779
Tucker-Lewis Index (TLI) 0.694 0.632
Robust Comparative Fit Index (CFI) 0.816
Robust Tucker-Lewis Index (TLI) 0.693
Loglikelihood and Information Criteria:
Loglikelihood user model (H0) -11535.690 -11535.690
Scaling correction factor 1.076
for the MLR correction
Loglikelihood unrestricted model (H1) -11342.023 -11342.023
Scaling correction factor 0.994
for the MLR correction
Number of free parameters 18 18
Akaike (AIC) 23107.380 23107.380
Bayesian (BIC) 23197.484 23197.484
Sample-size adjusted Bayesian (BIC) 23140.311 23140.311
Root Mean Square Error of Approximation:
RMSEA 0.195 0.215
90 Percent Confidence Interval 0.179 0.212 0.197 0.233
P-value RMSEA <= 0.05 0.000 0.000
Robust RMSEA 0.196
90 Percent Confidence Interval 0.181 0.211
Standardized Root Mean Square Residual:
SRMR 0.079 0.079
Parameter Estimates:
Information Observed
Standard Errors Robust.huber.white
Latent Variables:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
Sit =~
Sit1r 1.317 0.060 21.977 0.000 1.317 0.735
Sit2 0.844 0.057 14.718 0.000 0.844 0.596
Sit3r 1.287 0.055 23.255 0.000 1.287 0.796
Sit4 0.824 0.057 14.340 0.000 0.824 0.586
Sit5r 0.970 0.055 17.607 0.000 0.970 0.562
Sit6 0.720 0.059 12.137 0.000 0.720 0.477
Intercepts:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.Sit1r 4.443 0.054 82.373 0.000 4.443 2.480
.Sit2 5.287 0.043 123.919 0.000 5.287 3.731
.Sit3r 4.778 0.049 98.116 0.000 4.778 2.954
.Sit4 5.314 0.042 125.550 0.000 5.314 3.780
.Sit5r 4.761 0.052 91.624 0.000 4.761 2.759
.Sit6 5.263 0.045 115.739 0.000 5.263 3.485
Sit 0.000 0.000 0.000
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.Sit1r 1.475 0.108 13.649 0.000 1.475 0.460
.Sit2 1.294 0.082 15.747 0.000 1.294 0.645
.Sit3r 0.959 0.093 10.279 0.000 0.959 0.367
.Sit4 1.298 0.081 16.072 0.000 1.298 0.657
.Sit5r 2.037 0.108 18.787 0.000 2.037 0.684
.Sit6 1.762 0.088 20.034 0.000 1.762 0.772
Sit 1.000 1.000 1.000
R-Square:
Estimate
Sit1r 0.540
Sit2 0.355
Sit3r 0.633
Sit4 0.343
Sit5r 0.316
Sit6 0.228
```

There are multiple equivalent ways of specifying the same CFA model, but with different scaling for the factor mean and variance (i.e., different means of identification). Now let’s see the model parameters when using the marker item method for model identification instead. In the marker item identification method:

- The first factor loading of a factor (the first variable after the `=~` symbol in `lavaan` syntax) is set to one
- The factor variance is estimated
- The factor mean is set to zero
- All item intercepts and unique variances are estimated (as in the Z-score identification method)
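Because the two identification methods describe the same model, the marker-item solution can be recovered by rescaling the Z-score-identified estimates (the new factor is \(F_{marker} = \lambda_1 F_{z}\)). A sketch using the loadings from the `sem()` output above:

```r
# Rescale Z-score-identified loadings to the marker-item solution (marker = Sit1r)
zload <- c(Sit1r = 1.317, Sit2 = 0.844, Sit3r = 1.287,
           Sit4 = 0.824, Sit5r = 0.970, Sit6 = 0.720)
marker_load <- zload / zload["Sit1r"]  # marker loading becomes exactly 1
factor_var  <- zload[["Sit1r"]]^2      # estimated factor variance under marker ID
round(marker_load, 3)
round(factor_var, 3)   # 1.734
```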