Introduce measurement (psychometric) models in general
Describe the steps needed in a psychometric model analysis
Dive deeper into the observed-variables modeling aspect
Measurement Model in general
Measurement Model Analysis Steps
%%{init: {"flowchart": {"htmlLabels": false}} }%%
graph TD
subgraph "Measurement Procedure"
subgraph Modeling
direction LR
Start --> id1
id1([Specify model]) --> id2(["`Specify scale identification
method for latent variables`"])
id2 --> id3([Estimate model])
id3 --> id4{Adequate fit indices?}
id4 -- No --> id1
end
Modeling -- Model fit acceptable --> one
subgraph one["Evaluation: Measurement Model with Auxiliary Components"]
direction LR
id5(["Score estimation
and secondary analyses with scores"]) --> id6([Item evaluation])
id6 --> id7([Scale construction])
id7 --> id8([Equating])
id8 --> id9([Measurement invariance / differential item functioning])
id9 --> End
end
end
style Start fill:#f9f,stroke:#333,stroke-width:5px
style End fill:#bbf,stroke:#333,stroke-width:5px
Components of a Measurement Model
There are two components of a measurement model:
Theory (what we cannot see but whose existence we assume):
Latent variable(s)
Other effects as needed by the model
Random effects (e.g., initial status and slopes in a latent growth model)
Testlet effects (e.g., group-level variation among items)
Effects of observed variables (e.g., gender differences, DIF, measurement invariance)
Data (what we can see and assume to be generated by the theory):
Outcomes
An assumed distribution for each outcome
A key statistic of each outcome for the model (e.g., mean, SD)
Latent variables in Bayesian modeling are built with the following specification:
What are their distributions? (normal or otherwise)
For example, \(\theta_i\) denotes the factor score values for one person, and \(\theta\) the factor scores across the sample. The sample-level factor score \(\theta\) is a mixture of the distributions of each individual's factor score \(\theta_i\).
By contrast, in MLE/WLSMV estimation we do not estimate the mean and SD of each individual's factor score; otherwise the model would not converge.
library(tidyverse)

set.seed(12)
N = 15        # number of persons
ndraws = 200  # number of draws per person
FS = matrix(NA, nrow = N, ncol = ndraws)

# person-level factor scores, sorted for a cleaner plot
FS_p = rnorm(N)
FS_p = FS_p[order(FS_p)]

# each person's factor score distribution: normal around that person's score
for (i in 1:N) {
  FS[i, ] = rnorm(ndraws, mean = FS_p[i], sd = 1)
}

FS_plot <- as.data.frame(t(FS))
colnames(FS_plot) <- paste0("Person", 1:N)
FS_plot <- FS_plot |>
  pivot_longer(everything(), names_to = "Person", values_to = "Factor Score")
FS_plot$Person <- factor(FS_plot$Person, levels = paste0("Person", 1:N))

# per-person densities (colored) against the density of the person-level
# factor scores (black): the sample distribution is a mixture
ggplot() +
  geom_density(aes(x = `Factor Score`, fill = Person, col = Person),
               alpha = .5, data = FS_plot) +
  geom_density(aes(x = FS_p))
Multidimensionality
How many factors are to be measured?
If there are \(\geq 2\) factors, we specify a mean vector and a variance-covariance matrix for the latent variables
To link the latent variables with the observed variables, we need an indicator matrix of the coefficients/effects of the latent variables on the items
In diagnostic modeling and multidimensional IRT, this matrix is called the Q-matrix
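For illustration, a Q-matrix for the two-factor, six-item setup used in Simulation Study 1 below might look like this (a hypothetical sketch; the actual Q in Lecture06.R may be constructed differently):

  # Q-matrix: rows are items, columns are factors;
  # a 1 means the item loads on that factor
  Q <- matrix(c(1, 0,
                1, 0,
                1, 0,
                0, 1,
                0, 1,
                0, 1),
              nrow = 6, ncol = 2, byrow = TRUE)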
Simulation Study 1
Let's perform a small simulation study to see how to carry out a factor analysis in Stan from scratch.
The model is a two-factor specification, with each factor measured by 3 items; in total there are 6 items with continuous responses. The sample size is 1000.
We will iterate the item response function over the items
To benefit from the efficiency of vectorization, we specify the factor loadings as a vector of length 6
\(\{\lambda_1, \lambda_2, \cdots, \lambda_6 \}\)
Optionally, the \(\lambda\)s can be specified as an items-by-factors matrix with entries such as \(\lambda_{11}\) and \(\lambda_{62}\); this adds flexibility at the cost of complexity
We also need a location table telling Stan which factor each factor loading belongs to
For example, \(\lambda_{1}\) belongs to factor 1 and \(\lambda_{4}\) belongs to factor 2
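As a minimal sketch (assuming the hypothetical Q-matrix above; the actual loc object in Lecture06.R may be built differently), the location table can be derived from the Q-matrix:

  # rebuild the Q-matrix, then list the (item, factor) position of each loading
  Q <- cbind(c(1, 1, 1, 0, 0, 0),
             c(0, 0, 0, 1, 1, 1))
  loc <- which(Q == 1, arr.ind = TRUE)  # one row per nonzero loading
  loc <- loc[order(loc[, 1]), ]         # sort by item
  colnames(loc) <- c("Item", "Theta")
  loc[, 2]  # the kk vector passed to Stan: 1 1 1 2 2 2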
For the LKJ prior with eta = 1, the distribution is uniform over the space of all correlation matrices
Data Structure and Data Block
Lecture06.R
data_list <- list(
  N = 1000,               # number of subjects/observations
  J = J,                  # number of items
  K = 2,                  # number of latent variables
  Y = Y,
  Q = Q,
  # location/index of lambda
  kk = loc[, 2],
  # hyperparameters
  sigmaRate = .01,
  meanMu = rep(0, J),
  covMu = diag(1000, J),
  meanTheta = rep(0, 2),
  eta = 1,
  meanLambda = rep(0, J),
  covLambda = diag(1000, J)
)
simulation_loc.stan
data {
  int<lower=0> N;            // number of observations
  int<lower=0> J;            // number of items
  int<lower=0> K;            // number of latent variables
  matrix[N, J] Y;            // item responses
  // location/index of lambda
  array[J] int<lower=0> kk;
  // hyperparameters
  real<lower=0> sigmaRate;
  vector[J] meanMu;
  matrix[J, J] covMu;        // prior covariance matrix for coefficients
  vector[K] meanTheta;
  vector[J] meanLambda;
  matrix[J, J] covLambda;    // prior covariance matrix for coefficients
  real<lower=0> eta;         // LKJ shape parameter
}
Parameters and Transformed Parameters blocks
simulation_loc.stan
parameters {
  vector[J] mu;                       // item intercepts
  vector<lower=0,upper=1>[J] lambda;  // factor loadings
  vector<lower=0>[J] sigma;           // the unique residual standard deviation for each item
  matrix[N, K] theta;                 // the latent variables (one for each person)
  cholesky_factor_corr[K] L;          // Cholesky factor (L) of the factor correlation matrix
}
transformed parameters {
  matrix[K, K] corrTheta = multiply_lower_tri_self_transpose(L);
}
Note that L is the Cholesky factor of the factor correlation matrix
It is a lower-triangular matrix
Use cholesky_factor_corr[K] to declare this parameter
Often the parameters declared in the parameters block are those that are easy to sample; the transformed parameters block is then used to transform them into the quantities that would be difficult to sample directly.
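As a quick numerical illustration (a hypothetical 2 \(\times\) 2 example, not part of the lecture code), L %*% t(L) in R reproduces what multiply_lower_tri_self_transpose(L) computes in Stan:

  # lower-triangular Cholesky factor of a 2 x 2 correlation matrix
  # with correlation .3
  L <- matrix(c(1,  0,
                .3, sqrt(1 - .3^2)),
              nrow = 2, byrow = TRUE)
  L %*% t(L)  # recovers the correlation matrix: 1s on the diagonal, .3 off it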
Model block
simulation_loc.stan
model {
  mu ~ multi_normal(meanMu, covMu);
  sigma ~ exponential(sigmaRate);
  lambda ~ multi_normal(meanLambda, covLambda);
  L ~ lkj_corr_cholesky(eta);
  for (i in 1:N) {
    theta[i, ] ~ multi_normal(meanTheta, corrTheta);
  }
  for (j in 1:J) {
    // loop over each item response function
    Y[, j] ~ normal(mu[j] + lambda[j] * theta[, kk[j]], sigma[j]);
  }
}
Note that kk[j] selects which factor the loading multiplies, depending on that factor loading's index. That is why we have a location matrix for the factor loadings; the Theta column of the loc table is kk.
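The slides do not show the R call that runs the sampler; a minimal sketch with cmdstanr might look like this (the file path, seed, and sampler settings are assumptions; 4 chains \(\times\) 2000 sampling iterations would produce the 8000 posterior draws referenced below):

  library(cmdstanr)

  mod <- cmdstan_model("simulation_loc.stan")  # compile the model
  fit <- mod$sample(
    data = data_list,     # the data_list built above
    seed = 12,            # assumed seed
    chains = 4,
    iter_warmup = 1000,
    iter_sampling = 2000  # 4 x 2000 = 8000 posterior draws
  )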
We first examined the max/mean PSRF (R-hat) for convergence; this is also called the Gelman-Rubin diagnostic. The maximum PSRF is 1.03, suggesting the MCMC estimation converged.
Then, we examined LOO with PSIS. According to the Pareto k diagnostic, most of the log-likelihood estimates are reliable.
Question: why do we have 8000 \(\times\) 6000 log-likelihood elements? Hint: think about the information in our data.
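A sketch of these checks (assuming the cmdstanr fit object above, and that the Stan program also saves a log_lik quantity in a generated quantities block, which the listing does not show):

  # convergence: max/mean PSRF (R-hat) across all parameters
  s <- fit$summary()
  max(s$rhat, na.rm = TRUE)   # 1.03 in the lecture results
  mean(s$rhat, na.rm = TRUE)

  # LOO with PSIS; the printout includes the Pareto k diagnostic.
  # log_lik holds one element per posterior draw per observation.
  loo_res <- fit$loo(variables = "log_lik")
  print(loo_res)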
To illustrate how to model a more complex CFA with cross-loadings, let's generate new simulated data with a test length of 7 and two latent variables
In the mean structure of the model, each latent variable is measured by 4 items; with only 7 items in total, one item therefore cross-loads on both factors. The other parameters are the same as in Model 1.
Mean Structure of Model 2
Strategies for Model 2
The general idea is that we assign an uninformative prior distribution to the "main" factor loadings, and an informative, "close-to-zero" prior distribution to the "zero" factor loadings:
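In formula form, the strategy might be sketched as follows (the specific prior standard deviations are illustrative assumptions, not values from the lecture code):
\[
\lambda_{jk} \sim
\begin{cases}
\mathcal{N}(0,\ 10) & \text{if } \lambda_{jk} \text{ is a "main" loading} \\
\mathcal{N}(0,\ 0.1) & \text{if } \lambda_{jk} \text{ is a "zero" loading}
\end{cases}
\]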
Strategies for Model 2 (Cont.)
Build a more detailed location matrix for the factor loading matrix
where Item is the index of the item a factor loading belongs to, and Theta is the index of the latent variable it belongs to.
In Stan's data block, we denote the Item column of the location matrix as jj and the Theta column as kk; q indicates which of the two types of prior distributions each factor loading receives.
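A hypothetical construction of loc2 (assuming items 1-4 load on factor 1 and items 4-7 on factor 2, so that item 4 cross-loads; the actual loc2 in Lecture06.R may differ):

  # one row per (item, factor) pair: 7 items x 2 factors = 14 rows
  loc2 <- as.matrix(expand.grid(Item = 1:7, Theta = 1:2))
  main <- (loc2[, "Theta"] == 1 & loc2[, "Item"] <= 4) |
          (loc2[, "Theta"] == 2 & loc2[, "Item"] >= 4)
  loc2 <- cbind(loc2, q = ifelse(main, 1, 2))  # q = 1: "main"; q = 2: "zero"
  loc2 <- loc2[order(loc2[, "Item"], loc2[, "Theta"]), ]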
simulation_exp2.stan
data {
  ...
  int<lower=0> R;          // number of rows in location matrix
  array[R] int<lower=0> jj;
  array[R] int<lower=0> kk;
  array[R] int<lower=0> q;
  ...
}
Data Structure and Data Block
Lecture06.R
data_list2 <- list(
  N = 1000,               # number of subjects/observations
  J = J2,                 # number of items
  K = 2,                  # number of latent variables
  Y = Y2,
  Q = Q2,
  # location of lambda
  R = nrow(loc2),
  jj = loc2[, 1],
  kk = loc2[, 2],
  q = loc2[, 3],
  # hyperparameters
  meanSigma = .1,
  scaleSigma = 1,
  meanMu = rep(0, J2),
  covMu = diag(10, J2),
  meanTheta = rep(0, 2),
  corrTheta = matrix(c(1, .3, .3, 1), 2, 2, byrow = TRUE)
)
simulation_exp2.stan
data {
  int<lower=0> N;            // number of observations
  int<lower=0> J;            // number of items
  int<lower=0> K;            // number of latent variables
  matrix[N, J] Y;            // item responses
  int<lower=0> R;            // number of rows in location matrix
  array[R] int<lower=0> jj;
  array[R] int<lower=0> kk;
  array[R] int<lower=0> q;
  // hyperparameters
  real<lower=0> meanSigma;
  real<lower=0> scaleSigma;
  vector[J] meanMu;
  matrix[J, J] covMu;        // prior covariance matrix for coefficients
  vector[K] meanTheta;
  matrix[K, K] corrTheta;    // fixed factor correlation matrix
}
Note that for simplicity of estimation, I specified the factor correlation matrix as fixed. If you are interested in estimating the factor correlations, refer to the previous model, which uses LKJ sampling.
Parameters block
simulation_exp2.stan
parameters {
  vector<lower=0,upper=1>[J] mu;     // item intercepts
  matrix<lower=0>[J, K] lambda;      // factor loadings
  vector<lower=0,upper=1>[J] sigma;  // the unique residual standard deviation for each item
  matrix[N, K] theta;                // the latent variables (one for each person)
  // matrix[K, K] corrTheta;  // not declared here: corrTheta is passed in as data
}
In the parameters block, the only difference is that we specify lambda as a J \(\times\) K matrix, which is 7 \(\times\) 2 in our case.
Model block
As you can see, in the model block we need to use if/else logic in Stan to assign priors to the factor loadings in different locations
For type 1 (green), we specify an uninformative normal prior
For type 2 (red), we specify an informative shrinkage prior
For the univariate residual standard deviations, we can simply use a Cauchy or exponential prior
The item response function models each item response using \(\mu + \Lambda\Theta\) as the kernel and \(\sigma\) as the variation
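Written out, the item response function for person \(i\) and item \(j\) is (a sketch consistent with the Model 1 code above):
\[
Y_{ij} \sim \mathcal{N}\!\left(\mu_j + \sum_{k=1}^{K} \lambda_{jk}\,\theta_{ik},\ \sigma_j\right)
\]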