Example data: 635 older adults (ages 80-100) self-reporting on 7 items assessing the Instrumental Activities of Daily Living (IADL). The percentage shown for each item is the proportion endorsing the binary response of 1 ("does not need help"):

  1. Housework (cleaning and laundry): 1=64%
  2. Bedmaking: 1=84%
  3. Cooking: 1=77%
  4. Everyday shopping: 1=66%
  5. Getting to places outside of walking distance: 1=65%
  6. Handling banking and other business: 1=73%
  7. Using the telephone: 1=94%

Two versions of a response format were available:

Binary -> 0 = “needs help”, 1 = “does not need help”

Categorical -> 0 = “can’t do it”, 1 = “big problems”, 2 = “some problems”, 3 = “no problems”

Higher scores indicate greater function. We will look at each response format in turn.

Package Installation and Loading

if (require(lavaan) == FALSE){
  install.packages("lavaan")
}
library(lavaan)
if (require(mirt) == FALSE){
  install.packages("mirt")
}
library(mirt)

Data Import into R

The data are in a text file named adl.dat originally used in Mplus (so no column names were included at the top of the file). The file contains more items than we will use, so we select only the items listed above.

#read in data file (Mplus format, so we must supply column names ourselves)
adlData = read.table(file = "adl.dat", header = FALSE, na.strings = ".", col.names = c("case", paste0("dpa", 1:14), paste0("dia", 1:7), paste0("cpa", 1:14), paste0("cia", 1:7)))
#select only the IADL items (dia1-dia7)
iadlDataInit = adlData[c(paste0("dia", 1:7))]
#remove cases with all items missing
removeCases = which(apply(X = iadlDataInit, MARGIN = 1, FUN = function (x){ length(which(is.na(x)))}) == 7)
iadlData = iadlDataInit[-removeCases,]
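The all-missing filter above can be sanity-checked on toy data. This sketch (hypothetical values, not adl.dat) drops exactly the row that is missing on every item:

```r
# Toy check of the all-missing filter (hypothetical data, not adl.dat):
# a 3-row, 7-column data frame where only row 2 is missing on every item
toy = as.data.frame(matrix(1, nrow = 3, ncol = 7))
toy[2, ] = NA
allMissing = which(apply(X = toy, MARGIN = 1,
                         FUN = function(x) sum(is.na(x))) == ncol(toy))
toyClean = toy[-allMissing, ]
nrow(toyClean)  # 2 rows remain
```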

Estimation with Marginal Maximum Likelihood

We will introduce the mirt package as a method for estimating IRT models with marginal maximum likelihood. Overall, the package is very good, but it is typically used for scaling purposes (measurement rather than use of latent variables in additional model equations). If you wish to use the latent trait estimates in secondary analyses (which you would otherwise conduct simultaneously in SEM), additional steps are needed to ensure the error associated with each score is carried into the subsequent analyses.

When all items of the model are the same type, the mirt syntax is very short. The mirt() function provides the estimates, with model=1 indicating all items measure the same trait and itemtype="2PL" requesting the two-parameter logistic model ("Rasch" is the shorthand for the 1PL). The "Rasch" designation estimates a model where the loadings are all fixed to one and the factor/latent trait variance is estimated, which is equivalent to the 1PL estimated below; here we instead keep the latent trait standardized by constraining the discriminations to be equal. We will estimate both models here:

mirt1PLsyntax = "
IADL = 1-7
CONSTRAIN = (1-7, a1)
COV = 1
"
model1PLmirt = mirt(data = iadlData, model = mirt1PLsyntax)

Iteration: 1, Log-Lik: -1878.487, Max-Change: 1.89501
Iteration: 2, Log-Lik: -1577.569, Max-Change: 1.18305
Iteration: 3, Log-Lik: -1505.605, Max-Change: 0.58836
Iteration: 4, Log-Lik: -1483.057, Max-Change: 0.34308
Iteration: 5, Log-Lik: -1474.066, Max-Change: 0.23870
Iteration: 6, Log-Lik: -1469.990, Max-Change: 0.16868
Iteration: 7, Log-Lik: -1466.436, Max-Change: 0.06185
Iteration: 8, Log-Lik: -1465.995, Max-Change: 0.04630
Iteration: 9, Log-Lik: -1465.692, Max-Change: 0.03439
Iteration: 10, Log-Lik: -1465.184, Max-Change: 0.01246
Iteration: 11, Log-Lik: -1465.079, Max-Change: 0.01223
Iteration: 12, Log-Lik: -1464.992, Max-Change: 0.01176
Iteration: 13, Log-Lik: -1464.662, Max-Change: 0.00863
Iteration: 14, Log-Lik: -1464.646, Max-Change: 0.00732
Iteration: 15, Log-Lik: -1464.632, Max-Change: 0.00642
Iteration: 16, Log-Lik: -1464.579, Max-Change: 0.00288
Iteration: 17, Log-Lik: -1464.576, Max-Change: 0.00252
Iteration: 18, Log-Lik: -1464.574, Max-Change: 0.00216
Iteration: 19, Log-Lik: -1464.569, Max-Change: 0.00417
Iteration: 20, Log-Lik: -1464.568, Max-Change: 0.00277
Iteration: 21, Log-Lik: -1464.567, Max-Change: 0.00138
Iteration: 22, Log-Lik: -1464.566, Max-Change: 0.00109
Iteration: 23, Log-Lik: -1464.566, Max-Change: 0.00161
Iteration: 24, Log-Lik: -1464.565, Max-Change: 0.00118
Iteration: 25, Log-Lik: -1464.565, Max-Change: 0.00168
Iteration: 26, Log-Lik: -1464.564, Max-Change: 0.00142
Iteration: 27, Log-Lik: -1464.564, Max-Change: 0.00099
Iteration: 28, Log-Lik: -1464.564, Max-Change: 0.00104
Iteration: 29, Log-Lik: -1464.563, Max-Change: 0.00102
Iteration: 30, Log-Lik: -1464.563, Max-Change: 0.00086
Iteration: 31, Log-Lik: -1464.563, Max-Change: 0.00082
Iteration: 32, Log-Lik: -1464.563, Max-Change: 0.00074
Iteration: 33, Log-Lik: -1464.563, Max-Change: 0.00070
Iteration: 34, Log-Lik: -1464.563, Max-Change: 0.00076
Iteration: 35, Log-Lik: -1464.563, Max-Change: 0.00065
Iteration: 36, Log-Lik: -1464.563, Max-Change: 0.00074
Iteration: 37, Log-Lik: -1464.563, Max-Change: 0.00051
Iteration: 38, Log-Lik: -1464.563, Max-Change: 0.00065
Iteration: 39, Log-Lik: -1464.563, Max-Change: 0.00048
Iteration: 40, Log-Lik: -1464.563, Max-Change: 0.00049
Iteration: 41, Log-Lik: -1464.563, Max-Change: 0.00039
Iteration: 42, Log-Lik: -1464.563, Max-Change: 0.00031
Iteration: 43, Log-Lik: -1464.563, Max-Change: 0.00010
Iteration: 44, Log-Lik: -1464.563, Max-Change: 0.00009
model2PLmirt = mirt(data = iadlData, model = 1, itemtype = "2PL")

Iteration: 1, Log-Lik: -1878.487, Max-Change: 1.36006
Iteration: 2, Log-Lik: -1572.925, Max-Change: 1.11015
Iteration: 3, Log-Lik: -1499.005, Max-Change: 0.91115
Iteration: 4, Log-Lik: -1474.055, Max-Change: 0.72471
Iteration: 5, Log-Lik: -1464.621, Max-Change: 0.54473
Iteration: 6, Log-Lik: -1460.408, Max-Change: 0.40623
Iteration: 7, Log-Lik: -1456.776, Max-Change: 0.16294
Iteration: 8, Log-Lik: -1456.384, Max-Change: 0.13300
Iteration: 9, Log-Lik: -1456.118, Max-Change: 0.10161
Iteration: 10, Log-Lik: -1455.651, Max-Change: 0.03754
Iteration: 11, Log-Lik: -1455.561, Max-Change: 0.02954
Iteration: 12, Log-Lik: -1455.486, Max-Change: 0.02638
Iteration: 13, Log-Lik: -1455.201, Max-Change: 0.01568
Iteration: 14, Log-Lik: -1455.180, Max-Change: 0.01315
Iteration: 15, Log-Lik: -1455.164, Max-Change: 0.01114
Iteration: 16, Log-Lik: -1455.117, Max-Change: 0.00739
Iteration: 17, Log-Lik: -1455.110, Max-Change: 0.00634
Iteration: 18, Log-Lik: -1455.105, Max-Change: 0.00517
Iteration: 19, Log-Lik: -1455.092, Max-Change: 0.00484
Iteration: 20, Log-Lik: -1455.089, Max-Change: 0.00409
Iteration: 21, Log-Lik: -1455.087, Max-Change: 0.00363
Iteration: 22, Log-Lik: -1455.084, Max-Change: 0.00367
Iteration: 23, Log-Lik: -1455.082, Max-Change: 0.00288
Iteration: 24, Log-Lik: -1455.081, Max-Change: 0.00267
Iteration: 25, Log-Lik: -1455.078, Max-Change: 0.00262
Iteration: 26, Log-Lik: -1455.078, Max-Change: 0.00195
Iteration: 27, Log-Lik: -1455.077, Max-Change: 0.00180
Iteration: 28, Log-Lik: -1455.075, Max-Change: 0.00088
Iteration: 29, Log-Lik: -1455.075, Max-Change: 0.00063
Iteration: 30, Log-Lik: -1455.075, Max-Change: 0.00065
Iteration: 31, Log-Lik: -1455.074, Max-Change: 0.00024
Iteration: 32, Log-Lik: -1455.074, Max-Change: 0.00014
Iteration: 33, Log-Lik: -1455.074, Max-Change: 0.00013
Iteration: 34, Log-Lik: -1455.074, Max-Change: 0.00073
Iteration: 35, Log-Lik: -1455.074, Max-Change: 0.00016
Iteration: 36, Log-Lik: -1455.074, Max-Change: 0.00048
Iteration: 37, Log-Lik: -1455.074, Max-Change: 0.00010

Unlike lavaan, mirt does not provide nicely formatted parameter output from the summary statement. Rather, we obtain the estimates in pieces through various functions.

The model log-likelihood and summary information is given by the show() function:

show(model1PLmirt)

Call:
mirt(data = iadlData, model = mirt1PLsyntax)

Full-information item factor analysis with 1 factor(s).
Converged within 1e-04 tolerance after 44 EM iterations.
mirt version: 1.25 
M-step optimizer: BFGS 
EM acceleration: Ramsay 
Number of rectangular quadrature: 61

Log-likelihood = -1464.563
Estimated parameters: 8 
AIC = 2945.125; AICc = 2945.355
BIC = 2980.754; SABIC = 2955.355
show(model2PLmirt)

Call:
mirt(data = iadlData, model = 1, itemtype = "2PL")

Full-information item factor analysis with 1 factor(s).
Converged within 1e-04 tolerance after 37 EM iterations.
mirt version: 1.25 
M-step optimizer: BFGS 
EM acceleration: Ramsay 
Number of rectangular quadrature: 61

Log-likelihood = -1455.074
Estimated parameters: 14 
AIC = 2938.149; AICc = 2938.826
BIC = 3000.499; SABIC = 2956.051

Also note that the model log-likelihood information does not include a test of the model against an alternative, as a typical CFA analysis does when comparing your model's fit to one where all parameters are estimated. This is because the saturated model in IRT is different (for models where all items are binary, it is the multivariate Bernoulli distribution): the statistics of interest are the proportions of people with each response pattern.
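To make the saturated model concrete: with seven binary items there are 2^7 = 128 possible response patterns, so the saturated model has 127 free pattern proportions. Tabulating observed patterns is a one-liner, sketched here on a small hypothetical matrix:

```r
# For 7 binary items, the saturated (multivariate Bernoulli) model has
# 2^7 - 1 = 127 free response-pattern proportions
nPatterns = 2^7 - 1

# Tabulating observed patterns (shown on a small hypothetical matrix;
# the same idiom applies to complete rows of iadlData)
toy = matrix(c(1,1,0,
               1,1,0,
               0,1,0), nrow = 3, byrow = TRUE)
patterns = apply(toy, 1, paste, collapse = "")
table(patterns)  # "010" appears once, "110" twice
```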

To see estimates, use the coef() function. Here are the estimates for the 1PL model:

coef1PL = coef(model1PLmirt)
coef1PL
$dia1
      a1     d g u
par 4.39 1.637 0 1

$dia2
      a1     d g u
par 4.39 4.651 0 1

$dia3
      a1     d g u
par 4.39 3.509 0 1

$dia4
      a1     d g u
par 4.39 1.908 0 1

$dia5
      a1     d g u
par 4.39 1.881 0 1

$dia6
      a1     d g u
par 4.39 2.988 0 1

$dia7
      a1     d g u
par 4.39 7.467 0 1

$GroupPars
    MEAN_1 COV_11
par      0      1

The coef() function returns an R list of the parameters for each item along with the structural model parameters (the $GroupPars element), which shows the mean and variance of the latent variable. For each item, there are at least four parameters listed:

  1. a1: the item discrimination (slope)
  2. d: the item intercept
  3. g: the lower asymptote (guessing parameter; fixed to 0 here)
  4. u: the upper asymptote (fixed to 1 here)

Note how the item discrimination (the a1 term) is equal for all items – this equality constraint defines the 1PL model.

Putting the parameters into equation form, we have a slope/intercept form of the IRT model:

\[P(Y_{si} = 1 | \theta_s) = g_i + (u_i-g_i)\frac{\exp\left(d_i + a1_i \theta_s \right)}{1+\exp\left(d_i + a1_i \theta_s \right)}\]

Another commonly used parameterization of the IRT model is called discrimination/difficulty, given by:

\[P(Y_{si} = 1 | \theta_s) = g_i + (u_i-g_i)\frac{\exp\left(a1_i \left( \theta_s - b_i \right) \right)}{1+\exp\left(a1_i \left( \theta_s - b_i \right) \right)}\]

The two parameterizations are equivalent and one can be found by re-arranging terms of the other. To get the item difficulty from the slope/intercept parameterization:

\[b_i = -\frac{d_i}{a1_i}\]
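As a quick numerical check (using the rounded 1PL values for dia1 shown above, a1 = 4.39 and d = 1.637), the two parameterizations produce identical response probabilities:

```r
# Same 1PL curve under both parameterizations (dia1 values from above)
a = 4.39; d = 1.637
b = -d / a                          # difficulty from slope/intercept
theta = seq(-3, 3, 0.5)
pSlopeInt = plogis(d + a * theta)   # plogis(x) = exp(x)/(1 + exp(x))
pDiscDiff = plogis(a * (theta - b))
all.equal(pSlopeInt, pDiscDiff)     # TRUE
b                                   # about -0.373
```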

For our results, we can use the lapply() function to add the item difficulties to each set of item parameters:

#append the difficulty b = -d/a1 to an item's parameter matrix
getDifficulty = function(itemPar){
  parnames = colnames(itemPar)
  if ("a1" %in% parnames){ #skip non-item elements such as $GroupPars
    itemPar = c(itemPar, -1*itemPar[2]/itemPar[1])
    names(itemPar) = c(parnames, "b")
    return(itemPar)
  } else {
    return(itemPar)
  }
}
lapply(X = coef1PL, FUN = getDifficulty)
$dia1
        a1          d          g          u          b 
 4.3895702  1.6372853  0.0000000  1.0000000 -0.3729944 

$dia2
       a1         d         g         u         b 
 4.389570  4.651017  0.000000  1.000000 -1.059561 

$dia3
        a1          d          g          u          b 
 4.3895702  3.5086672  0.0000000  1.0000000 -0.7993191 

$dia4
        a1          d          g          u          b 
 4.3895702  1.9084551  0.0000000  1.0000000 -0.4347704 

$dia5
        a1          d          g          u          b 
 4.3895702  1.8810667  0.0000000  1.0000000 -0.4285309 

$dia6
        a1          d          g          u          b 
 4.3895702  2.9876548  0.0000000  1.0000000 -0.6806258 

$dia7
       a1         d         g         u         b 
 4.389570  7.466592  0.000000  1.000000 -1.700985 

$GroupPars
    MEAN_1 COV_11
par      0      1

For the 2PL, we can use a similar method (here condensed to display the item difficulties):

coef2PL = lapply(X = coef(model2PLmirt), FUN = getDifficulty)
coef2PL
$dia1
       a1         d         g         u         b 
 4.373935  1.606529  0.000000  1.000000 -0.367296 

$dia2
       a1         d         g         u         b 
 5.058002  5.227547  0.000000  1.000000 -1.033520 

$dia3
        a1          d          g          u          b 
 4.3647262  3.4551388  0.0000000  1.0000000 -0.7916049 

$dia4
       a1         d         g         u         b 
 7.197112  2.944432  0.000000  1.000000 -0.409113 

$dia5
       a1         d         g         u         b 
 4.273968  1.807076  0.000000  1.000000 -0.422810 

$dia6
        a1          d          g          u          b 
 3.4634209  2.4201714  0.0000000  1.0000000 -0.6987806 

$dia7
       a1         d         g         u         b 
 3.303801  5.952140  0.000000  1.000000 -1.801604 

$GroupPars
    MEAN_1 COV_11
par      0      1

As the 1PL is nested within the 2PL, we can use a likelihood ratio test to see which model is preferred. The LRT tests the null hypothesis that all item discriminations are equal against an alternative that not all are equal:

anova(model1PLmirt, model2PLmirt)

Model 1: mirt(data = iadlData, model = mirt1PLsyntax)
Model 2: mirt(data = iadlData, model = 1, itemtype = "2PL")

       AIC     AICc    SABIC      BIC    logLik     X2  df     p
1 2945.125 2945.355 2955.355 2980.754 -1464.563    NaN NaN   NaN
2 2938.149 2938.826 2956.051 3000.499 -1455.074 18.977   6 0.004

Here, the test statistic was \(\chi^2_{6} = 18.977\) with a p-value of .004. Therefore, we reject the null hypothesis of equal slopes and conclude that the 2PL model fits better than the 1PL model.
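The test statistic can be reproduced by hand (up to rounding) from the log-likelihoods and parameter counts reported by show() above:

```r
# LRT by hand from the values reported by show()
ll1PL = -1464.563  # 8 estimated parameters
ll2PL = -1455.074  # 14 estimated parameters
X2 = -2 * (ll1PL - ll2PL)
df = 14 - 8
p = pchisq(X2, df = df, lower.tail = FALSE)
round(c(X2 = X2, df = df, p = p), 3)  # X2 = 18.978, df = 6, p = 0.004
```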

The LRT, however, assumes both models fit the data adequately in an absolute sense. One way to check is the M2() function, which assesses model fit to the 2-way tables (think item-pair covariances). Because our data have some missing responses, we use the impute = 10 option, imputing 10 values per missing response. Here are the values for both models:

M2(obj = model1PLmirt, impute = 10)
M2(obj = model2PLmirt, impute=10)

The statistics given by the M2() function are similar to those used in CFA: approximate fit indices such as RMSEA, SRMR, TLI, and CFI. From these, the model appears to fit approximately (CFI and TLI near 1, but a relatively poor RMSEA). To find misfitting “residuals” we need complete data, as the M2() and imputeMissing() functions do not currently work together with missing responses. So, here is an example with complete data and the 2PL:

model2PLmirtB = mirt(data = iadlData[complete.cases(iadlData),], model = 1, itemtype = "2PL")

Iteration: 1, Log-Lik: -1817.248, Max-Change: 1.46767
Iteration: 2, Log-Lik: -1517.948, Max-Change: 1.03243
Iteration: 3, Log-Lik: -1447.274, Max-Change: 1.00582
Iteration: 4, Log-Lik: -1424.072, Max-Change: 0.74506
Iteration: 5, Log-Lik: -1415.121, Max-Change: 0.55113
Iteration: 6, Log-Lik: -1411.156, Max-Change: 0.39934
Iteration: 7, Log-Lik: -1407.784, Max-Change: 0.17155
Iteration: 8, Log-Lik: -1407.439, Max-Change: 0.13233
Iteration: 9, Log-Lik: -1407.206, Max-Change: 0.10162
Iteration: 10, Log-Lik: -1406.798, Max-Change: 0.03483
Iteration: 11, Log-Lik: -1406.725, Max-Change: 0.02867
Iteration: 12, Log-Lik: -1406.663, Max-Change: 0.02278
Iteration: 13, Log-Lik: -1406.425, Max-Change: 0.01418
Iteration: 14, Log-Lik: -1406.410, Max-Change: 0.01160
Iteration: 15, Log-Lik: -1406.397, Max-Change: 0.00953
Iteration: 16, Log-Lik: -1406.352, Max-Change: 0.00510
Iteration: 17, Log-Lik: -1406.348, Max-Change: 0.00443
Iteration: 18, Log-Lik: -1406.344, Max-Change: 0.00389
Iteration: 19, Log-Lik: -1406.331, Max-Change: 0.00247
Iteration: 20, Log-Lik: -1406.330, Max-Change: 0.00214
Iteration: 21, Log-Lik: -1406.330, Max-Change: 0.00209
Iteration: 22, Log-Lik: -1406.327, Max-Change: 0.00080
Iteration: 23, Log-Lik: -1406.327, Max-Change: 0.00070
Iteration: 24, Log-Lik: -1406.327, Max-Change: 0.00071
Iteration: 25, Log-Lik: -1406.326, Max-Change: 0.00055
Iteration: 26, Log-Lik: -1406.326, Max-Change: 0.00021
Iteration: 27, Log-Lik: -1406.326, Max-Change: 0.00013
Iteration: 28, Log-Lik: -1406.326, Max-Change: 0.00075
Iteration: 29, Log-Lik: -1406.326, Max-Change: 0.00013
Iteration: 30, Log-Lik: -1406.326, Max-Change: 0.00060
Iteration: 31, Log-Lik: -1406.326, Max-Change: 0.00053
Iteration: 32, Log-Lik: -1406.326, Max-Change: 0.00019
Iteration: 33, Log-Lik: -1406.326, Max-Change: 0.00053
Iteration: 34, Log-Lik: -1406.326, Max-Change: 0.00015
Iteration: 35, Log-Lik: -1406.326, Max-Change: 0.00037
Iteration: 36, Log-Lik: -1406.326, Max-Change: 0.00017
Iteration: 37, Log-Lik: -1406.326, Max-Change: 0.00015
Iteration: 38, Log-Lik: -1406.326, Max-Change: 0.00041
Iteration: 39, Log-Lik: -1406.326, Max-Change: 0.00038
Iteration: 40, Log-Lik: -1406.326, Max-Change: 0.00015
Iteration: 41, Log-Lik: -1406.326, Max-Change: 0.00038
Iteration: 42, Log-Lik: -1406.326, Max-Change: 0.00008
M2(obj = model2PLmirtB)
M2(obj = model2PLmirtB, residmat = TRUE)
            dia1        dia2         dia3       dia4        dia5       dia6 dia7
dia1          NA          NA           NA         NA          NA         NA   NA
dia2  0.02663977          NA           NA         NA          NA         NA   NA
dia3  0.06240477  0.06313870           NA         NA          NA         NA   NA
dia4 -0.02758525 -0.02668132 -0.038895907         NA          NA         NA   NA
dia5 -0.02215675 -0.01360095 -0.080263677 0.04226633          NA         NA   NA
dia6 -0.03183329 -0.04414976 -0.029552955 0.02706127  0.03628025         NA   NA
dia7 -0.02704504 -0.01200917  0.007826515 0.01542917 -0.00691414 0.02537108   NA

Here we see the biggest discrepancy among the residual covariances is that for dia5 with dia3, at -.08.

Finally, we can view plots of our model (all shown for the 2PL model). First, the item characteristic curve for item 1:

plot(model2PLmirt, item=1, type = "trace", theta_lim = c(-3,3))

Next we can see the test information plot:

plot(model2PLmirt, type = "info", theta_lim = c(-3,3))

We can see that our test information peaks around a theta of -.5, meaning scores near -.5 will be the most reliable.
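The shape of this curve can be approximated directly from the printed 2PL estimates: for a logistic item, item information is \(a^2 P (1-P)\), and test information is the sum over items. This sketch uses the rounded a and d values from the coef() output above:

```r
# Approximate 2PL test information from the rounded item estimates above
a = c(4.374, 5.058, 4.365, 7.197, 4.274, 3.463, 3.304)
d = c(1.607, 5.228, 3.455, 2.944, 1.807, 2.420, 5.952)
thetaGrid = seq(-3, 3, 0.01)
info = sapply(thetaGrid, function(t) {
  p = plogis(d + a * t)    # P(Y = 1 | theta) for each item
  sum(a^2 * p * (1 - p))   # sum of item informations
})
thetaGrid[which.max(info)]  # location of maximum information
```

The maximum falls in the neighborhood of -.5, consistent with the plot.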

Finally, we can use the fscores() function to get the estimated trait scores. Note that several types of scores are available. The standard used for score reporting is method = "EAP", which gives each person the expected value of the posterior distribution of the trait. For secondary analyses, multiple “plausible” scores should instead be drawn with the option plausible.draws = #, where # is the number of draws. We plot a histogram of the trait scores after estimating them with fscores():

theta = fscores(object = model2PLmirt, method = "EAP")
hist(theta)

The plot shows a number of people with scores at the maximum value, but very few around the most reliable portion of the test, near -.5.

Here, we will plot the scores along with the standard error of the scores to show the relationship:

theta2 = fscores(object = model2PLmirt, method = "EAP", full.scores.SE = TRUE)
plot(x = theta2[complete.cases(iadlData),1], y = theta2[complete.cases(iadlData),2], xlab = expression(theta), ylab = "SE", main = "Complete Data Theta vs. SE(Theta)")

For the cases with complete data, this is the best the SE gets. When plotting all the data, you can see the impact of missing data on the SE: fewer observed responses mean a higher SE for the trait estimate.

plot(x = theta2[,1], y = theta2[,2], xlab = expression(theta), ylab = "SE", main = "All Data Theta vs. SE(Theta)")

We can also plot the item difficulty values to get a sense of what the scale tells us about the trait:

itemDif = unlist(lapply(X = coef2PL, FUN = function(x) return(x[5])))
plot(x = 1:7, y = itemDif[1:7], type = "l", xlab = "Item", ylab = "Item Difficulty (b)")

Here are the items, again:

  1. Housework (cleaning and laundry): 1=64%
  2. Bedmaking: 1=84%
  3. Cooking: 1=77%
  4. Everyday shopping: 1=66%
  5. Getting to places outside of walking distance: 1=65%
  6. Handling banking and other business: 1=73%
  7. Using the telephone: 1=94%

Note that no items are available to measure above-average abilities well! The item difficulties for most items fall between −1.0 and −0.5 on the Theta scale.

Estimation in lavaan: Limited Information Only

lavaan only provides limited-information estimates of IRT/IFA models, which parallel (but are not necessarily equal to) Mplus’ WLSMV estimates. This is a limitation of lavaan, so if ML versions of the estimates are needed, Mplus will have to be used. Alternatively, the mirt package in R estimates IRT models (and many other types), but has rather limited capabilities for SEM, does not include model functions for continuous observed variables, and provides very few SEM fit statistics.

We will compare the one-parameter and two-parameter models for the binary responses using the WLSMV probit model, beginning with the two-parameter model.

Two-Parameter Model

The two-parameter model is called the two-parameter logistic model (2PL) if the logit link function is used; otherwise it is called the two-parameter normal ogive (2PNO) model. The syntax is largely identical to that used for CFA; however, item intercepts (item ~ 1 in CFA) are now replaced by item thresholds (item | t#, where # is the number of the threshold, in order, from 1 through the number of categories on the item minus one).

The normal ogive model is a model for the “underlying” continuous analog \(Y^*\) of a categorical variable \(Y\).
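A minimal numeric sketch of the probit response function, using the dia1 estimates that appear in the lavaan output below (loading 2.730, threshold −1.020): the probability of a 1 response is pnorm(−τ + λθ), which equals .5 exactly at θ = τ/λ (the IRT difficulty).

```r
# Probit (normal ogive) item response function for dia1
lambda = 2.730   # factor loading (discrimination)
tau = -1.020     # threshold
pnorm(-tau + lambda * 0)              # P(Y = 1) at theta = 0, about .85
pnorm(-tau + lambda * (tau / lambda)) # exactly .5 at theta = tau/lambda
```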

model2PSyntax = "
# loadings/discrimination parameters:
IADL =~ dia1 + dia2 + dia3 + dia4 + dia5 + dia6 + dia7
# thresholds use the | operator and start at value 1 after t:
dia1 | t1; dia2 | t1; dia3 | t1; dia4 | t1; dia5 | t1; dia6 | t1; dia7 | t1; 
# factor mean fixed to zero:
IADL ~ 0*1;
# factor variance:
IADL ~~ 1*IADL
"
model2PEstimates = sem(model = model2PSyntax, data = iadlData, ordered = c("dia1", "dia2", "dia3", "dia4", "dia5", "dia6", "dia7"),
                       mimic = "Mplus", estimator = "WLSMV", std.lv = TRUE, parameterization = "theta")
lavaan WARNING: 1 bivariate tables have empty cells; to see them, use:
                  lavInspect(fit, "zero.cell.tables")
summary(model2PEstimates, fit.measures = TRUE, rsquare = TRUE, standardized = TRUE)
lavaan (0.5-23.1097) converged normally after  79 iterations

                                                  Used       Total
  Number of observations                           609         635

  Estimator                                       DWLS      Robust
  Minimum Function Test Statistic               34.035      50.039
  Degrees of freedom                                14          14
  P-value (Chi-square)                           0.002       0.000
  Scaling correction factor                                  0.714
  Shift parameter                                            2.361
    for simple second-order correction (WLSMV)

Model test baseline model:

  Minimum Function Test Statistic            18222.540   12045.449
  Degrees of freedom                                21          21
  P-value                                        0.000       0.000

User model versus baseline model:

  Comparative Fit Index (CFI)                    0.999       0.997
  Tucker-Lewis Index (TLI)                       0.998       0.996

  Robust Comparative Fit Index (CFI)                            NA
  Robust Tucker-Lewis Index (TLI)                               NA

Root Mean Square Error of Approximation:

  RMSEA                                          0.049       0.065
  90 Percent Confidence Interval          0.028  0.070       0.046  0.085
  P-value RMSEA <= 0.05                          0.512       0.091

  Robust RMSEA                                                  NA
  90 Percent Confidence Interval                                NA     NA

Standardized Root Mean Square Residual:

  SRMR                                           0.036       0.036

Weighted Root Mean Square Residual:

  WRMR                                           1.103       1.103

Parameter Estimates:

  Information                                 Expected
  Standard Errors                           Robust.sem

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  IADL =~                                                               
    dia1              2.730    0.332    8.214    0.000    2.730    0.939
    dia2              2.922    0.493    5.930    0.000    2.922    0.946
    dia3              2.785    0.381    7.307    0.000    2.785    0.941
    dia4              3.698    0.590    6.266    0.000    3.698    0.965
    dia5              2.449    0.291    8.416    0.000    2.449    0.926
    dia6              1.955    0.220    8.902    0.000    1.955    0.890
    dia7              1.511    0.278    5.425    0.000    1.511    0.834

Intercepts:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
    IADL              0.000                               0.000    0.000
   .dia1              0.000                               0.000    0.000
   .dia2              0.000                               0.000    0.000
   .dia3              0.000                               0.000    0.000
   .dia4              0.000                               0.000    0.000
   .dia5              0.000                               0.000    0.000
   .dia6              0.000                               0.000    0.000
   .dia7              0.000                               0.000    0.000

Thresholds:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
    dia1|t1          -1.020    0.185   -5.524    0.000   -1.020   -0.351
    dia2|t1          -3.080    0.480   -6.413    0.000   -3.080   -0.997
    dia3|t1          -2.220    0.305   -7.267    0.000   -2.220   -0.750
    dia4|t1          -1.581    0.306   -5.169    0.000   -1.581   -0.413
    dia5|t1          -1.092    0.176   -6.196    0.000   -1.092   -0.413
    dia6|t1          -1.439    0.169   -8.515    0.000   -1.439   -0.655
    dia7|t1          -2.936    0.374   -7.850    0.000   -2.936   -1.621

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
    IADL              1.000                               1.000    1.000
   .dia1              1.000                               1.000    0.118
   .dia2              1.000                               1.000    0.105
   .dia3              1.000                               1.000    0.114
   .dia4              1.000                               1.000    0.068
   .dia5              1.000                               1.000    0.143
   .dia6              1.000                               1.000    0.207
   .dia7              1.000                               1.000    0.305

Scales y*:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
    dia1              0.344                               0.344    1.000
    dia2              0.324                               0.324    1.000
    dia3              0.338                               0.338    1.000
    dia4              0.261                               0.261    1.000
    dia5              0.378                               0.378    1.000
    dia6              0.455                               0.455    1.000
    dia7              0.552                               0.552    1.000

R-Square:
                   Estimate
    dia1              0.882
    dia2              0.895
    dia3              0.886
    dia4              0.932
    dia5              0.857
    dia6              0.793
    dia7              0.695
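Because parameterization = "theta" fixes each residual variance of y* to 1 and std.lv = TRUE fixes the factor variance to 1, the standardized loading, R-square, and Scales y* columns can all be recovered from the raw loading alone. A check for dia1 (loading 2.730):

```r
# Implied standardized quantities for dia1 under the theta parameterization
lambda = 2.730
sdYstar = sqrt(lambda^2 + 1)      # SD of the underlying y*
round(lambda / sdYstar, 3)        # Std.all loading: 0.939
round((lambda / sdYstar)^2, 3)    # R-square: 0.882
round(1 / sdYstar, 3)             # Scales y*: 0.344
```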

We can also inspect the residual polychoric correlation matrix to investigate model misfit. Note these are the raw residuals, not a normalized or standardized version. Below that, the modification indices are displayed.

resid(model2PEstimates)
$type
[1] "raw"

$cov
     dia1   dia2   dia3   dia4   dia5   dia6   dia7  
dia1  0.000                                          
dia2  0.027  0.000                                   
dia3  0.037  0.029  0.000                            
dia4 -0.022 -0.042 -0.045  0.000                     
dia5 -0.030 -0.030 -0.100  0.030  0.000              
dia6 -0.043 -0.048 -0.039  0.026  0.028  0.000       
dia7 -0.110 -0.003  0.011  0.033 -0.031  0.065  0.000

$mean
dia1 dia2 dia3 dia4 dia5 dia6 dia7 
   0    0    0    0    0    0    0 

$th
      dia1|t1       dia2|t1       dia3|t1       dia4|t1       dia5|t1       dia6|t1       dia7|t1 
-1.142277e-08 -7.507156e-08  3.858897e-08  1.378347e-10  5.264907e-08 -4.490636e-08  2.146585e-08 
modificationindices(model2PEstimates, sort. = TRUE)
    lhs op  rhs    mi mi.scaled    epc sepc.lv sepc.all sepc.nox
53 dia4 ~~ dia5 9.157    12.828  0.889   0.889    0.088    0.088
50 dia3 ~~ dia5 8.877    12.435 -0.995  -0.995   -0.127   -0.127
39 dia1 ~~ dia3 7.766    10.879  0.719   0.719    0.084    0.084
49 dia3 ~~ dia4 4.328     6.062 -0.816  -0.816   -0.072   -0.072
44 dia2 ~~ dia3 4.060     5.688  0.581   0.581    0.064    0.064
45 dia2 ~~ dia4 2.852     3.995 -0.743  -0.743   -0.063   -0.063
43 dia1 ~~ dia7 2.334     3.270 -0.674  -0.674   -0.128   -0.128
54 dia4 ~~ dia6 2.312     3.239  0.422   0.422    0.050    0.050
42 dia1 ~~ dia6 2.094     2.933 -0.370  -0.370   -0.058   -0.058
47 dia2 ~~ dia6 1.986     2.781 -0.422  -0.422   -0.062   -0.062
38 dia1 ~~ dia2 1.879     2.632  0.421   0.421    0.047    0.047
40 dia1 ~~ dia4 1.862     2.608 -0.475  -0.475   -0.043   -0.043
41 dia1 ~~ dia5 1.801     2.523 -0.358  -0.358   -0.047   -0.047
58 dia6 ~~ dia7 1.644     2.303  0.338   0.338    0.085    0.085
51 dia3 ~~ dia6 1.640     2.298 -0.341  -0.341   -0.052   -0.052
56 dia5 ~~ dia6 1.588     2.224  0.260   0.260    0.045    0.045
46 dia2 ~~ dia5 1.091     1.529 -0.335  -0.335   -0.041   -0.041
57 dia5 ~~ dia7 0.227     0.319 -0.178  -0.178   -0.037   -0.037
55 dia4 ~~ dia7 0.127     0.179  0.254   0.254    0.037    0.037
52 dia3 ~~ dia7 0.052     0.073  0.081   0.081    0.015    0.015
48 dia2 ~~ dia7 0.005     0.007 -0.026  -0.026   -0.005   -0.005

From the modification indices, we see several of the trends we saw with the M2() residuals in mirt: items 5 and 3 need some additional help (as do items 1 and 3, and items 5 and 4).

ICC Plots

library(ggplot2) #the plots below are built with ggplot2
lavaanCatItemPlot = function(lavObject, varname, sds = 3){
  output = inspect(object = lavObject, what = "est")
  if (!varname %in% rownames(output$lambda)) stop(paste(varname, "not found in lavaan object"))
  if (dim(output$lambda)[2]>1) stop("plots only given for one factor models")
  
  itemloading = output$lambda[which(rownames(output$lambda) == varname),1]
  itemthresholds = output$tau[grep(pattern = varname, x = rownames(output$tau))]
  
  factorname = colnames(output$lambda)
  factormean = output$alpha[which(rownames(output$alpha) == factorname)]
  factorvar = output$psi[which(rownames(output$psi) == factorname)]
  
  factormin = factormean - sds*sqrt(factorvar)
  factormax = factormean + sds*sqrt(factorvar)
  
  factorX = seq(factormin, factormax, .01)
  itemloc = which(lavObject@Data@ov$name == varname)      
  itemlevels = unlist(strsplit(x = lavObject@Data@ov$lnam[itemloc], split = "\\|"))  
  if (length(itemthresholds)>1){
    
    plotdata = NULL
    plotdata2 = NULL
    itemY = NULL
    itemY2 = NULL
    itemX = NULL
    itemText = NULL
    for (level in 1:length(itemthresholds)){
      
      itemY = pnorm(q = -1*itemthresholds[level] + itemloading*factorX)
      itemY2 = cbind(itemY2, pnorm(q = -1*itemthresholds[level] + itemloading*factorX))
      itemText = paste0("P(", varname, " > ", itemlevels[level], ")")
      itemText2 = paste0("P(", varname, " = ", itemlevels[level], ")")
      plotdata = rbind(plotdata, data.frame(factor = factorX, prob = itemY, plot = itemText))
      
      if (level == 1){
        plotdata2 = data.frame(factor = factorX, plot = itemText2, prob = matrix(1, nrow = dim(itemY2)[1], ncol=1) - itemY2[,level])
      } else if (level == length(itemthresholds)){
        plotdata2 = rbind(plotdata2, data.frame(factor = factorX, plot = itemText2, prob = itemY2[,level-1] - itemY2[,level]))
        plotdata2 = rbind(plotdata2, data.frame(factor = factorX, plot = paste0("P(", varname, " = ", itemlevels[level+1], ")"), prob = itemY2[,level]))                  
      } else {
        plotdata2 = rbind(plotdata2, data.frame(factor = factorX, plot = itemText2, prob = itemY2[,level-1] - itemY2[,level]))
      }
      
    }
    
    names(plotdata) = c(factorname , "Probability", "Cumulative")
    ggplot(data = plotdata, aes_string(x = factorname, y = "Probability", colour = "Cumulative")) + geom_line(size = 2)
    
    names(plotdata2) = c(factorname, "Response", "Probability")
    ggplot(data = plotdata2, aes_string(x = factorname, y = "Probability", colour = "Response")) + geom_line(size = 2)
  } else {
    
    itemY = pnorm(q = -1*itemthresholds[1] + itemloading*factorX)
    itemText2 = paste0("P(", varname, " = ", itemlevels[2], ")")
    plotdata = data.frame(factor = factorX, prob = itemY, plot = itemText2)
    
    names(plotdata) = c(factorname , "Probability", "Response")
    ggplot(data = plotdata, aes_string(x = factorname, y = "Probability", colour = "Response")) + geom_line(size = 2)
    
  }
}
lavaanCatItemPlot(lavObject = model2PEstimates, varname = "dia2", sds = 3)

Conversion to IRT Paramterization (Discrimination/Difficulty)

convertTheta2IRT = function(lavObject){
  if (!lavObject@Options$parameterization == "theta") stop("your model is not estimated with parameterization='theta'")
  
  output = inspect(object = lavObject, what = "est")
  if (ncol(output$lambda)>1) stop("IRT conversion is only valid for one dimensional factor models. Your model has more than one dimension.")
  
  a = output$lambda
  b = output$tau/output$lambda
  return(list(a = a, b=b))
}
convertTheta2IRT(lavObject = model2PEstimates)
$a
      IADL
dia1 2.730
dia2 2.922
dia3 2.785
dia4 3.698
dia5 2.449
dia6 1.955
dia7 1.511

$b
        thrshl
dia1|t1 -0.374
dia2|t1 -1.054
dia3|t1 -0.797
dia4|t1 -0.428
dia5|t1 -0.446
dia6|t1 -0.736
dia7|t1 -1.943

One-Parameter Normal Ogive Model

To estimate the 1PNO model in lavaan, we use the following syntax. The summary, residuals, and modification indices are displayed subsequently.

model1PSyntax = "
# loadings/discrimination parameters:
IADL =~ loading*dia1 + loading*dia2 + loading*dia3 + loading*dia4 + loading*dia5 + loading*dia6 + loading*dia7
# threshholds use the | operator and start at value 1 after t:
dia1 | t1; dia2 | t1; dia3 | t1; dia4 | t1; dia5 | t1; dia6 | t1; dia7 | t1; 
# factor mean:
IADL ~ 0;
# factor variance:
IADL ~~ 1*IADL
"
model1PEstimates = sem(model = model1PSyntax, data = iadlData, ordered = c("dia1", "dia2", "dia3", "dia4", "dia5", "dia6", "dia7"),
                       mimic = "Mplus", estimator = "WLSMV", std.lv = TRUE, parameterization = "theta")
lavaan WARNING: 1 bivariate tables have empty cells; to see them, use:
                  lavInspect(fit, "zero.cell.tables")
summary(model1PEstimates, fit.measures = TRUE, rsquare = TRUE, standardized = TRUE)
lavaan (0.5-23.1097) converged normally after  27 iterations

                                                  Used       Total
  Number of observations                           609         635

  Estimator                                       DWLS      Robust
  Minimum Function Test Statistic               62.072      64.108
  Degrees of freedom                                20          20
  P-value (Chi-square)                           0.000       0.000
  Scaling correction factor                                  1.037
  Shift parameter                                            4.227
    for simple second-order correction (WLSMV)

Model test baseline model:

  Minimum Function Test Statistic            18222.540   12045.449
  Degrees of freedom                                21          21
  P-value                                        0.000       0.000

User model versus baseline model:

  Comparative Fit Index (CFI)                    0.998       0.996
  Tucker-Lewis Index (TLI)                       0.998       0.996

  Robust Comparative Fit Index (CFI)                            NA
  Robust Tucker-Lewis Index (TLI)                               NA

Root Mean Square Error of Approximation:

  RMSEA                                          0.059       0.060
  90 Percent Confidence Interval          0.043  0.076       0.044  0.077
  P-value RMSEA <= 0.05                          0.175       0.141

  Robust RMSEA                                                  NA
  90 Percent Confidence Interval                                NA     NA

Standardized Root Mean Square Residual:

  SRMR                                           0.058       0.058

Weighted Root Mean Square Residual:

  WRMR                                           1.489       1.489

Parameter Estimates:

  Information                                 Expected
  Standard Errors                           Robust.sem

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  IADL =~                                                               
    dia1    (ldng)    2.630    0.166   15.830    0.000    2.630    0.935
    dia2    (ldng)    2.630    0.166   15.830    0.000    2.630    0.935
    dia3    (ldng)    2.630    0.166   15.830    0.000    2.630    0.935
    dia4    (ldng)    2.630    0.166   15.830    0.000    2.630    0.935
    dia5    (ldng)    2.630    0.166   15.830    0.000    2.630    0.935
    dia6    (ldng)    2.630    0.166   15.830    0.000    2.630    0.935
    dia7    (ldng)    2.630    0.166   15.830    0.000    2.630    0.935

Intercepts:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
    IADL              0.000                               0.000    0.000
   .dia1              0.000                               0.000    0.000
   .dia2              0.000                               0.000    0.000
   .dia3              0.000                               0.000    0.000
   .dia4              0.000                               0.000    0.000
   .dia5              0.000                               0.000    0.000
   .dia6              0.000                               0.000    0.000
   .dia7              0.000                               0.000    0.000

Thresholds:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
    dia1|t1          -0.987    0.158   -6.253    0.000   -0.987   -0.351
    dia2|t1          -2.807    0.190  -14.758    0.000   -2.807   -0.997
    dia3|t1          -2.111    0.177  -11.903    0.000   -2.111   -0.750
    dia4|t1          -1.161    0.161   -7.193    0.000   -1.161   -0.413
    dia5|t1          -1.161    0.160   -7.272    0.000   -1.161   -0.413
    dia6|t1          -1.844    0.172  -10.737    0.000   -1.844   -0.655
    dia7|t1          -4.560    0.291  -15.697    0.000   -4.560   -1.621

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
    IADL              1.000                               1.000    1.000
   .dia1              1.000                               1.000    0.126
   .dia2              1.000                               1.000    0.126
   .dia3              1.000                               1.000    0.126
   .dia4              1.000                               1.000    0.126
   .dia5              1.000                               1.000    0.126
   .dia6              1.000                               1.000    0.126
   .dia7              1.000                               1.000    0.126

Scales y*:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
    dia1              0.355                               0.355    1.000
    dia2              0.355                               0.355    1.000
    dia3              0.355                               0.355    1.000
    dia4              0.355                               0.355    1.000
    dia5              0.355                               0.355    1.000
    dia6              0.355                               0.355    1.000
    dia7              0.355                               0.355    1.000

R-Square:
                   Estimate
    dia1              0.874
    dia2              0.874
    dia3              0.874
    dia4              0.874
    dia5              0.874
    dia6              0.874
    dia7              0.874
resid(object = model1PEstimates)
$type
[1] "raw"

$cov
     dia1   dia2   dia3   dia4   dia5   dia6   dia7  
dia1  0.000                                          
dia2  0.042  0.000                                   
dia3  0.047  0.045  0.000                            
dia4  0.011 -0.002 -0.010  0.000                     
dia5 -0.035 -0.028 -0.102  0.050  0.000              
dia6 -0.081 -0.080 -0.075  0.011 -0.022  0.000       
dia7 -0.201 -0.088 -0.078 -0.036 -0.133 -0.066  0.000

$mean
dia1 dia2 dia3 dia4 dia5 dia6 dia7 
   0    0    0    0    0    0    0 

$th
      dia1|t1       dia2|t1       dia3|t1       dia4|t1       dia5|t1       dia6|t1       dia7|t1 
-1.002357e-06 -9.511829e-08  1.171064e-06  4.800199e-07  4.800199e-07 -2.711188e-06 -3.423275e-06 
modificationindices(model1PEstimates, sort. = TRUE)
    lhs op  rhs     mi mi.scaled    epc sepc.lv sepc.all sepc.nox
23 dia7 ~~ dia7 13.706    13.222  1.863   1.863    0.235    0.235
59 dia4 ~~ dia5 10.157     9.799  0.462   0.462    0.058    0.058
22 dia6 ~~ dia6  9.921     9.571  0.869   0.869    0.110    0.110
56 dia3 ~~ dia5  7.506     7.241 -0.831  -0.831   -0.105   -0.105
49 dia1 ~~ dia7  6.714     6.477 -1.596  -1.596   -0.202   -0.202
20 dia4 ~~ dia4  6.692     6.456 -0.617  -0.617   -0.078   -0.078
45 dia1 ~~ dia3  6.172     5.954  0.413   0.413    0.052    0.052
48 dia1 ~~ dia6  5.665     5.465 -0.663  -0.663   -0.084   -0.084
50 dia2 ~~ dia3  5.065     4.886  0.397   0.397    0.050    0.050
57 dia3 ~~ dia6  4.638     4.474 -0.615  -0.615   -0.078   -0.078
53 dia2 ~~ dia6  4.314     4.161 -0.650  -0.650   -0.082   -0.082
63 dia5 ~~ dia7  3.514     3.390 -1.058  -1.058   -0.134   -0.134
44 dia1 ~~ dia2  2.742     2.645  0.350   0.350    0.044    0.044
54 dia2 ~~ dia7  2.415     2.329 -0.706  -0.706   -0.089   -0.089
58 dia3 ~~ dia7  1.835     1.770 -0.621  -0.621   -0.078   -0.078
47 dia1 ~~ dia5  1.616     1.559 -0.288  -0.288   -0.036   -0.036
64 dia6 ~~ dia7  1.298     1.253 -0.528  -0.528   -0.067   -0.067
52 dia2 ~~ dia5  0.707     0.682 -0.227  -0.227   -0.029   -0.029
18 dia2 ~~ dia2  0.660     0.636 -0.212  -0.212   -0.027   -0.027
62 dia5 ~~ dia6  0.647     0.624 -0.183  -0.183   -0.023   -0.023
17 dia1 ~~ dia1  0.598     0.577 -0.190  -0.190   -0.024   -0.024
19 dia3 ~~ dia3  0.496     0.479 -0.176  -0.176   -0.022   -0.022
46 dia1 ~~ dia4  0.261     0.252  0.094   0.094    0.012    0.012
60 dia4 ~~ dia6  0.252     0.243  0.098   0.098    0.012    0.012
55 dia3 ~~ dia4  0.149     0.144 -0.086  -0.086   -0.011   -0.011
61 dia4 ~~ dia7  0.134     0.130 -0.283  -0.283   -0.036   -0.036
21 dia5 ~~ dia5  0.081     0.078  0.071   0.071    0.009    0.009
51 dia2 ~~ dia4  0.006     0.006 -0.019  -0.019   -0.002   -0.002

To compare models, we use the anova() function, which uses the correct test for us. Here we see the 2PL is preferred to the 1PL, again.

anova(model1PEstimates, model2PEstimates)
Scaled Chi Square Difference Test (method = "satorra.2000")

                 Df AIC BIC  Chisq Chisq diff Df diff Pr(>Chisq)   
model2PEstimates 14         34.035                                 
model1PEstimates 20         62.072     18.013  5.0417    0.00303 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
---
title: "EPSY 906/CLDP 948 Example 5"
output: html_notebook
---

Example data: 635 older adults (age 80-100) self-reporting on 7 items assessing the Instrumental Activities of Daily Living (IADL) as follows:

1. Housework (cleaning and laundry): 1=64%
2. Bedmaking: 1=84%
3. Cooking: 1=77%
4. Everyday shopping: 1=66%
5. Getting to places outside of walking distance: 1=65%
6. Handling banking and other business: 1=73%	
7. Using the telephone: 1=94%

Two versions of a response format were available:

Binary -> 0 = “needs help”, 1 = “does not need help”

Categorical -> 0 = “can’t do it”, 1 = “big problems”, 2 = “some problems”, 3 = “no problems”

Higher scores indicate greater function. We will look at each response format in turn.

### Package Installation and Loading 

```{r setup, include=TRUE}
if (require(lavaan) == FALSE){
  install.packages("lavaan")
}
library(lavaan)

if (require(mirt) == FALSE){
  install.packages("mirt")
}
library(mirt)

```

### Data Import into R
The data are in a text file named ```adl.dat``` originally used in Mplus (so no column names are included at the top of the file). The file contains more items than we will use, so we select only the items listed above from the whole file.

```{r data, include=TRUE}

#read in data file (Mplus file format so having to label columns)
adlData = read.table(file = "adl.dat", header = FALSE, na.strings = ".", col.names = c("case", paste0("dpa", 1:14), paste0("dia", 1:7), paste0("cpa", 1:14), paste0("cia", 1:7)))

#select Situations items and PersonID variables
iadlDataInit = adlData[c(paste0("dia", 1:7))]

#remove cases with all items missing
removeCases = which(apply(X = iadlDataInit, MARGIN = 1, FUN = function (x){ length(which(is.na(x)))}) == 7)

iadlData = iadlDataInit[-removeCases,]
```

### Estimation with Marginal Maximum Likelihood

We will introduce the `mirt` package as a method for estimating IRT models. Overall, the package is very good, but it is typically used for scaling purposes (measurement rather than use of latent variables in additional model equations). We use the package to demonstrate estimating IRT models using marginal maximum likelihood. If you wish to use the latent trait estimates in secondary analyses (which you would otherwise use SEM for simultaneously), there are additional steps to take to ensure the error associated with each score is carried over to the subsequent analyses.

When all the items of the model are the same type, the `mirt` syntax is very short. The `mirt()` function provides the estimates, with options `model = 1` for all items measuring the same trait and `itemtype = "2PL"` for the two-parameter logistic model (`itemtype = "Rasch"` is the 1PL shorthand). The `"Rasch"` designation estimates a model where the loadings are all set to one and the factor/latent trait variance is estimated -- an equivalent model to the one estimated below, but we seek to keep the latent trait standardized, so we instead constrain the slopes to be equal. We will estimate both models here:

```{r mirt2PL, include=TRUE}
mirt1PLsyntax = "
IADL = 1-7
CONSTRAIN = (1-7, a1)
COV = 1
"

model1PLmirt = mirt(data = iadlData, model = mirt1PLsyntax)


model2PLmirt = mirt(data = iadlData, model = 1, itemtype = "2PL")
```

Unlike `lavaan`, `mirt` does not provide nicely formatted parameter output from the summary statement. Rather, we get the estimates through various pieces.

The model log-likelihood and summary information is given by the `show()` function:

```{r mirtSum, include=TRUE}
show(model1PLmirt)
show(model2PLmirt)
```

Also note that the model log-likelihood information does not include a test of the model against an alternative, as a typical CFA analysis does when comparing your model's fit to one where all parameters are estimated. This is because the saturated model in IRT is different (for models where all items are binary, it is multivariate Bernoulli): the statistics of interest come in the form of the proportion of people with a given _response pattern_.
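For intuition, the saturated model's sufficient statistics can be tabulated directly from the data. A toy sketch with two hypothetical binary items (not our IADL data):

```{r responsePatterns, include=TRUE}
# toy data: the frequency of each binary response pattern is the
# sufficient statistic for the saturated (multivariate Bernoulli) model
toy = data.frame(i1 = c(0, 0, 1, 1, 1, 1),
                 i2 = c(0, 0, 0, 1, 1, 1))
patterns = apply(toy, 1, paste, collapse = "")
patternProp = table(patterns) / nrow(toy)   # proportion of people per pattern
patternProp
```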

To see estimates, use the `coef()` function. Here are the estimates for the 1PL model:

```{r show1PL, include=TRUE}
coef1PL = coef(model1PLmirt)
coef1PL
```

The `coef()` function returns an R list of the parameters for each item along with the structural model parameters (the `$GroupPars` element), which shows the mean and variance of the latent variable. For each item, there are at least four parameters listed:

* `a1`: the factor loading/item discrimination/slope: the change in the log-odds of $P(Y_{si}=1)$ per one unit change in $\theta_s$ (approximate change in the 3/4PL models)
* `d`: the item intercept: the expected value of the log-odds of $P(Y_{si}=1)$ for $\theta_s = 0$ (approximate expectation in the 3/4PL models)
* `g`: the lower asymptote or guessing parameter (for 3PL this will be non-zero)
* `u`: the upper asymptote (fixed to one except in the 4PL or, in `mirt`, a variant of the 3PL)

Note how the item discrimination (the `a1` term) is equal for all items -- this is done by convention in the 1PL model.

Putting the parameters into equation form, we have a slope/intercept form of the IRT model:

$$P(Y_{si} = 1 | \theta_s) = g_i + (u_i-g_i)\frac{\exp\left(d_i + a1_i \theta_s \right)}{1+\exp\left(d_i + a1_i \theta_s \right)}$$
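As a quick numeric check on this formula, it can be computed directly with `plogis()` (the logistic function); the parameter values here are illustrative, not estimates from our models:

```{r slopeInterceptCheck, include=TRUE}
# slope/intercept IRT probability; g and u default to the 2PL values (0 and 1)
irtProb = function(theta, a1, d, g = 0, u = 1){
  g + (u - g) * plogis(d + a1 * theta)
}
irtProb(theta = 0, a1 = 2, d = 1)   # at theta = 0 this reduces to plogis(d)
```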

Another commonly used parameterization of the IRT model is called discrimination/difficulty, given by:

$$P(Y_{si} = 1 | \theta_s) = g_i + (u_i-g_i)\frac{\exp\left(a1_i \left( \theta_s - b_i \right) \right)}{1+\exp\left(a1_i \left( \theta_s - b_i \right) \right)}$$

The two parameterizations are equivalent and one can be found by re-arranging terms of the other. To get the item difficulty from the slope/intercept parameterization:

$$b_i = -\frac{d_i}{a1_i}$$
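A quick check (with illustrative values) that the two parameterizations give identical probabilities after this conversion:

```{r paramEquivCheck, include=TRUE}
a1 = 2; d = 1
b = -d / a1                                      # difficulty from slope/intercept
theta = seq(-3, 3, 0.5)
p_slopeIntercept = plogis(d + a1 * theta)        # slope/intercept form
p_discrimDifficulty = plogis(a1 * (theta - b))   # discrimination/difficulty form
all.equal(p_slopeIntercept, p_discrimDifficulty)
```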

For our results, we can use the `lapply` function to add the item difficulties:

```{r mirt1PLdifficulty, include=TRUE}
getDifficulty = function(itemPar){
  # appends b = -d/a1 to any element whose parameters include a slope (a1)
  parnames = colnames(itemPar)
  if ("a1" %in% parnames){
    itemPar = c(itemPar, -1*itemPar[2]/itemPar[1])  # b = -d/a1
    names(itemPar) = c(parnames, "b")
    return(itemPar)
  } else {
    # structural parameters (e.g., $GroupPars) are returned unchanged
    return(itemPar)
  }
}

lapply(X = coef1PL, FUN = getDifficulty)
```

For the 2PL, we can use a similar method (here condensed to display the item difficulties):
```{r model2PLcoef, include=TRUE}
coef2PL = lapply(X = coef(model2PLmirt), FUN = getDifficulty)
coef2PL
```

As the 1PL is nested within the 2PL, we can use a likelihood ratio test to see which model is preferred. The LRT tests the null hypothesis that all item discriminations are equal against an alternative that not all are equal:

```{r mirtLRT, include=TRUE}
anova(model1PLmirt, model2PLmirt)
```

Here, the test statistic was $\chi^2_6 = 18.977$ with a p-value of .004. Therefore, we reject the null hypothesis of equal slopes and conclude the 2PL fits better than the 1PL model. 
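As a sanity check, the p-value follows directly from the upper tail of the chi-square distribution:

```{r lrtPvalue, include=TRUE}
# upper-tail chi-square probability for the reported LRT statistic
pLRT = pchisq(18.977, df = 6, lower.tail = FALSE)
pLRT
```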

The LRT, however, assumes both models have a sufficient level of absolute fit to the data. One way to check is with the `M2()` function, which assesses model fit to the 2-way tables (think item-pair covariances). Because our data have some missing responses, we have to use the `impute = 10` option, imputing 10 values per missing response. Here is the value for the 1PL, followed by the 2PL:

```{r model1PLM2, include=TRUE}
M2(obj = model1PLmirt, impute = 10)
```

```{r model2PLM2, include=TRUE}
M2(obj = model2PLmirt, impute=10)
```

The statistics given by the `M2()` function are similar to those used in CFA -- approximate model fit indices such as RMSEA, SRMR, TLI, and CFI. From these, it appears the model fits approximately (CFI and TLI near 1, but a relatively poor RMSEA). To find misfitting "residuals" we need complete data, as the `M2()` and `imputeMissing()` functions do not currently work together. So, here is an example with complete cases and the 2PL:

```{r m2resid, include=TRUE}
model2PLmirtB = mirt(data = iadlData[complete.cases(iadlData),], model = 1, itemtype = "2PL")
M2(obj = model2PLmirtB)

M2(obj = model2PLmirtB, residmat = TRUE)
```
Here we see the biggest discrepancy among the residual covariances is that for dia5 with dia3, at -.08.

Finally, we can see plots of our model (all shown for the 2PL model). First, the item characteristic curves:

```{r plots, include=TRUE}
plot(model2PLmirt, item=1, type = "trace", theta_lim = c(-3,3))
```

Next we can see the test information plot:
```{r plots2, include=TRUE}
plot(model2PLmirt, type = "info", theta_lim = c(-3,3))
```
We can see that our test information peaks around a theta of -.5, meaning scores near -.5 will be the most reliable.
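For reference, the information for a logistic 2PL item is $a^2 p (1-p)$, and the test information is the sum over items. A sketch with illustrative parameters (not our estimates):

```{r itemInfoSketch, include=TRUE}
# item information for a 2PL item in slope/intercept form: a^2 * p * (1 - p)
itemInfo2PL = function(theta, a, d){
  p = plogis(d + a * theta)
  a^2 * p * (1 - p)
}
theta = seq(-3, 3, 0.1)
a = c(2.7, 2.9, 2.8)   # illustrative slopes
d = c(1.0, 3.1, 2.2)   # illustrative intercepts
testInfo = rowSums(sapply(seq_along(a), function(i) itemInfo2PL(theta, a[i], d[i])))
theta[which.max(testInfo)]   # where the sketch test is most informative
```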

Finally, we can use the `fscores()` function to get the estimated trait scores. Note that there are several types of scores available. The standard for score reporting is `method = "EAP"`, which uses the expected value (mean) of each person's posterior distribution. For secondary analyses, multiple "plausible" scores should be used instead, with the option `plausible.draws = #`, where # is the number of scores to draw. We plot the density of the trait scores after estimating them with `fscores()`:

```{r fscores, include=TRUE}
theta = fscores(object = model2PLmirt, method = "EAP")
hist(theta)
```
The plot shows a number of people with scores at the maximum value -- but very few around the most reliable portion of the test, near -.5.
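Conceptually, EAP scoring is the mean of the posterior distribution of the trait, which can be approximated over a quadrature grid. A minimal sketch for one response pattern, with hypothetical 2PL item parameters (not the estimates above):

```{r eapSketch, include=TRUE}
# EAP by numerical quadrature for one response pattern (hypothetical a/d values)
a = c(2.7, 2.9, 2.8, 3.7, 2.4, 2.0, 1.5)   # illustrative slopes
d = c(1.0, 3.1, 2.2, 1.6, 1.1, 1.4, 2.9)   # illustrative intercepts
y = c(1, 1, 0, 1, 0, 1, 1)                 # one observed response pattern
grid = seq(-6, 6, length.out = 201)
lik = sapply(grid, function(th){
  p = plogis(d + a * th)
  prod(p^y * (1 - p)^(1 - y))               # pattern likelihood at this theta
})
post = lik * dnorm(grid)                   # standard normal prior on theta
eap = sum(grid * post) / sum(post)         # posterior mean = EAP estimate
eap
```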

Here, we will plot the scores along with the standard error of the scores to show the relationship:

```{r fscoresSE, include=TRUE}
theta2 = fscores(object = model2PLmirt, method = "EAP", full.scores.SE = TRUE)
plot(x = theta2[complete.cases(iadlData),1], y = theta2[complete.cases(iadlData),2], xlab = expression(theta), ylab = "SE", main = "Complete Data Theta vs. SE(Theta)")
```

For the cases with complete data, this is the best the SE gets. When plotting all the data, you can see the impact of missing data on SE: fewer observations means a higher SE for the trait.

```{r plotSE2, include=TRUE}
plot(x = theta2[,1], y = theta2[,2], xlab = expression(theta), ylab = "SE", main = "All Data Theta vs. SE(Theta)")
```

We can also plot the item difficulty values to get a sense of what the scale is telling us about the trait:

```{r difplot, include=TRUE}
itemDif = unlist(lapply(X = coef2PL, FUN = function(x) return(x[5])))
plot(x = 1:7, y = itemDif[1:7], type = "l", xlab = "Item", ylab = "Item Difficulty (b)")
```
Here are the items, again:

1. Housework (cleaning and laundry): 1=64%
2. Bedmaking: 1=84%
3. Cooking: 1=77%
4. Everyday shopping: 1=66%
5. Getting to places outside of walking distance: 1=65%
6. Handling banking and other business: 1=73%	
7. Using the telephone: 1=94%

Note that no items are available to measure above-average abilities well! The item difficulty for most items covers values of Theta between −1.0 and −0.5.

### Estimation in `lavaan`: Limited Information Only

`lavaan` only provides limited information estimates of IRT/IFA models, which are parallel to (but not necessarily equal to) Mplus' WLSMV methods. This is a limitation of `lavaan`, so if ML versions of estimates are needed, Mplus will have to be used. Alternatively, the `mirt` package in R estimates IRT models (and many other types), but it has rather limited capabilities for SEM with these models: it does not include model functions for continuous observed variables and provides very few SEM fit statistics. 

We will compare the one-parameter and two-parameter models for binary responses using the WLSMV probit model, beginning with the two-parameter model.

#### Two-Parameter Model 

The two-parameter model is called the two-parameter logistic model (2PL) if the logit link function is used; otherwise it is called the two-parameter normal ogive (2PNO). The syntax is largely identical to that used for CFA; however, item intercepts (`item ~ 1` in CFA) are now replaced by item thresholds (`item | t#`, where `#` is the number of the threshold, in order, from 1 through the number of categories on the item minus one).

The normal ogive model provides a model for a categorical variable $Y$'s "underlying" continuous analog.
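Under the theta parameterization used below, the implied item characteristic curve is $P(Y = 1 \mid \theta) = \Phi(-\tau + \lambda\theta)$, where $\tau$ is the threshold and $\lambda$ the loading. A sketch with illustrative values (not estimates from the model below):

```{r probitICCSketch, include=TRUE}
# probit (normal ogive) ICC: P(Y = 1 | theta) = pnorm(-tau + lambda * theta)
lambda = 2.9    # illustrative loading/discrimination
tau = -1.05     # illustrative threshold
theta = seq(-3, 3, 0.1)
p = pnorm(-tau + lambda * theta)
# the curve crosses .5 where lambda * theta = tau, i.e., at theta = tau / lambda
plot(theta, p, type = "l", xlab = expression(theta), ylab = "P(Y = 1)")
```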

```{r lavaan2PL, include=TRUE}
model2PSyntax = "

# loadings/discrimination parameters:
IADL =~ dia1 + dia2 + dia3 + dia4 + dia5 + dia6 + dia7

# thresholds use the | operator and start at value 1 after t:
dia1 | t1; dia2 | t1; dia3 | t1; dia4 | t1; dia5 | t1; dia6 | t1; dia7 | t1; 

# factor mean:
IADL ~ 0;

# factor variance:
IADL ~~ 1*IADL

"

model2PEstimates = sem(model = model2PSyntax, data = iadlData, ordered = c("dia1", "dia2", "dia3", "dia4", "dia5", "dia6", "dia7"),
                       mimic = "Mplus", estimator = "WLSMV", std.lv = TRUE, parameterization = "theta")
summary(model2PEstimates, fit.measures = TRUE, rsquare = TRUE, standardized = TRUE)

```
We can also inspect the residual polychoric correlation matrix to investigate model misfit. Note these are the raw residuals, not a normalized or standardized version. Below that, the modification indices are displayed.

```{r lavaan2PL2, include=TRUE}
resid(model2PEstimates)
modificationindices(model2PEstimates, sort. = TRUE)
```

From the modification indices, we see several of the same trends as the `M2()` residuals from `mirt`: items 5 and 3 need some additional help (as do items 1 and 3, and items 5 and 4).

#### ICC Plots

```{r plotICCs, include=TRUE}
#the plotting below uses ggplot2, so install and load it if needed
if (require(ggplot2) == FALSE){
  install.packages("ggplot2")
}
library(ggplot2)

lavaanCatItemPlot = function(lavObject, varname, sds = 3){
  output = inspect(object = lavObject, what = "est")
  if (!varname %in% rownames(output$lambda)) stop(paste(varname, "not found in lavaan object"))
  if (dim(output$lambda)[2]>1) stop("plots only given for one factor models")
  
  itemloading = output$lambda[which(rownames(output$lambda) == varname),1]
  itemthresholds = output$tau[grep(pattern = varname, x = rownames(output$tau))]
  
  factorname = colnames(output$lambda)
  factormean = output$alpha[which(rownames(output$alpha) == factorname)]
  factorvar = output$psi[which(rownames(output$psi) == factorname)]
  
  factormin = factormean - sds*sqrt(factorvar)
  factormax = factormean + sds*sqrt(factorvar)
  
  factorX = seq(factormin, factormax, .01)
  itemloc = which(lavObject@Data@ov$name == varname)      
  itemlevels = unlist(strsplit(x = lavObject@Data@ov$lnam[itemloc], split = "\\|"))  
  if (length(itemthresholds)>1){

    
    plotdata = NULL
    plotdata2 = NULL
    itemY = NULL
    itemY2 = NULL
    itemX = NULL
    itemText = NULL
    for (level in 1:length(itemthresholds)){
      
      itemY = pnorm(q = -1*itemthresholds[level] + itemloading*factorX)
      itemY2 = cbind(itemY2, pnorm(q = -1*itemthresholds[level] + itemloading*factorX))
      itemText = paste0("P(", varname, " > ", itemlevels[level], ")")
      itemText2 = paste0("P(", varname, " = ", itemlevels[level], ")")
      plotdata = rbind(plotdata, data.frame(factor = factorX, prob = itemY, plot = itemText))
      
      if (level == 1){
        plotdata2 = data.frame(factor = factorX, plot = itemText2, prob = matrix(1, nrow = dim(itemY2)[1], ncol=1) - itemY2[,level])
      } else if (level == length(itemthresholds)){
        plotdata2 = rbind(plotdata2, data.frame(factor = factorX, plot = itemText2, prob = itemY2[,level-1] - itemY2[,level]))
        plotdata2 = rbind(plotdata2, data.frame(factor = factorX, plot = paste0("P(", varname, " = ", itemlevels[level+1], ")"), prob = itemY2[,level]))                  
      } else {
        plotdata2 = rbind(plotdata2, data.frame(factor = factorX, plot = itemText2, prob = itemY2[,level-1] - itemY2[,level]))
      }
      
    }
    
    names(plotdata) = c(factorname , "Probability", "Cumulative")
    #print() is needed so the first plot displays from inside the function
    print(ggplot(data = plotdata, aes_string(x = factorname, y = "Probability", colour = "Cumulative")) + geom_line(size = 2))
    
    names(plotdata2) = c(factorname, "Response", "Probability")
    ggplot(data = plotdata2, aes_string(x = factorname, y = "Probability", colour = "Response")) + geom_line(size = 2)
  } else {
    
    itemY = pnorm(q = -1*itemthresholds[1] + itemloading*factorX)
    itemText2 = paste0("P(", varname, " = ", itemlevels[2], ")")
    plotdata = data.frame(factor = factorX, prob = itemY, plot = itemText2)
    
    names(plotdata) = c(factorname , "Probability", "Response")
    ggplot(data = plotdata, aes_string(x = factorname, y = "Probability", colour = "Response")) + geom_line(size = 2)
    
  }
}

lavaanCatItemPlot(lavObject = model2PEstimates, varname = "dia2", sds = 3)
```

#### Conversion to IRT Parameterization (Discrimination/Difficulty)

```{r convert, include=TRUE}
convertTheta2IRT = function(lavObject){
  if (!lavObject@Options$parameterization == "theta") stop("your model is not estimated with parameterization='theta'")
  
  output = inspect(object = lavObject, what = "est")
  if (ncol(output$lambda)>1) stop("IRT conversion is only valid for one dimensional factor models. Your model has more than one dimension.")
  
  a = output$lambda
  b = output$tau/output$lambda
  return(list(a = a, b=b))
}

convertTheta2IRT(lavObject = model2PEstimates)
```

#### One-Parameter Normal Ogive Model

To estimate the 1PNO model in `lavaan`, we use the following syntax. The summary, residuals, and modification indices are displayed subsequently. 
```{r lavaan1PL, include=TRUE}
model1PSyntax = "

# loadings/discrimination parameters:
IADL =~ loading*dia1 + loading*dia2 + loading*dia3 + loading*dia4 + loading*dia5 + loading*dia6 + loading*dia7

# thresholds use the | operator and start at value 1 after t:
dia1 | t1; dia2 | t1; dia3 | t1; dia4 | t1; dia5 | t1; dia6 | t1; dia7 | t1; 

# factor mean:
IADL ~ 0;

# factor variance:
IADL ~~ 1*IADL

"

model1PEstimates = sem(model = model1PSyntax, data = iadlData, ordered = c("dia1", "dia2", "dia3", "dia4", "dia5", "dia6", "dia7"),
                       mimic = "Mplus", estimator = "WLSMV", std.lv = TRUE, parameterization = "theta")
summary(model1PEstimates, fit.measures = TRUE, rsquare = TRUE, standardized = TRUE)
resid(object = model1PEstimates)
modificationindices(model1PEstimates, sort. = TRUE)
```

To compare models, we use the `anova()` function, which uses the correct test for us. Here we see the 2PL is preferred to the 1PL, again.

```{r modelcompare, include=TRUE}
anova(model1PEstimates, model2PEstimates)
```