Show how to estimate unidimensional latent variable models with polytomous data
Also known as polytomous item response theory (IRT) or item factor analysis (IFA)
Distributions appropriate for polytomous data (discrete, with lower/upper limits)
Today’s Lecture Objectives
Course evaluation
How to model multidimensional factor structures
How to estimate models with missing data
Course evaluation
The course evaluation will open on April 23 and close on May 3. It is important to me, so please make sure to fill out the survey.
Example Data: Conspiracy Theories
Today’s example is from a bootstrap resample of 177 undergraduate students at a large state university in the Midwest.
The survey was a measure of 10 questions about their beliefs in various conspiracy theories that were being passed around the internet in the early 2010s
All item responses were on a 5-point Likert scale with:
Strongly Disagree
Disagree
Neither Agree nor Disagree
Agree
Strongly Agree
The purpose of this survey was to study individual beliefs regarding conspiracies.
Our purpose in using this instrument is to provide a context that we all may find relevant as many of these conspiracies are still prevalent.
Multidimensionality
More than one Latent Variable - Latent Parameter Space
We need to create latent variables by specifying which items measure which latent variables in an analysis model
This procedure is called different names in different fields:
Alignment (education measurement)
Factor pattern matrix (factor analysis)
Q-matrix (Question matrix; diagnostic models and multidimensional IRT)
From Q-matrix to Model
The alignment provides a specification of which latent variables are measured by which items
Sometimes we say items “load onto” factors
Mathematically, each of these terms simply specifies whether or not a latent variable appears as a predictor for an item
For instance, item 1 appears to measure nongovernment conspiracies, meaning its alignment (row vector of the Q-matrix) is:
\(\boldsymbol q_1 = \begin{bmatrix} 0 & 1 \end{bmatrix}\) (Gov: 0; NonGov: 1)
From Q-matrix to Model (Cont.)
The model for the first item is then built with only the factors measured by the item as being present:
\(\boldsymbol\lambda_1\) = \(\begin{bmatrix}\lambda_{11}\\\lambda_{12}\end{bmatrix}\) contains all possible factor loadings for item 1 (size 2 \(\times\) 1)
\(\boldsymbol\theta_p=[\theta_{p1}\ \ \theta_{p2}]\) contains the factor scores for person p
\(\text{diag}(\boldsymbol q_1)=\text{diag}\left(\begin{bmatrix} 0 & 1 \end{bmatrix}\right)=\begin{bmatrix}0 & 0 \\ 0 & 1 \end{bmatrix}\) is a diagonal matrix with the Q-matrix entries for item 1 on the diagonal.
In Stan, we model item thresholds (difficulty parameters) \(\tau_c\) in place of the intercepts \(\mu_c\), ordered so that \(\tau_1<\tau_2<\cdots<\tau_{C_i-1}\)
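Putting the definitions above together, the latent-variable part of item 1's linear predictor reduces to a single factor (a sketch consistent with the terms defined above):

\(\boldsymbol\theta_p \, \text{diag}(\boldsymbol q_1) \, \boldsymbol\lambda_1 = \begin{bmatrix}\theta_{p1} & \theta_{p2}\end{bmatrix} \begin{bmatrix}0 & 0\\ 0 & 1\end{bmatrix} \begin{bmatrix}\lambda_{11}\\ \lambda_{12}\end{bmatrix} = \lambda_{12}\theta_{p2}\)

so only the NonGov factor (with loading \(\lambda_{12}\)) enters the model for item 1.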
Stan Parameter Block
parameters {
  array[nObs] vector[nFactors] theta;        // the latent variables (one vector per person)
  array[nItems] ordered[maxCategory-1] thr;  // the item thresholds (one per item category minus one)
  vector[nLoadings] lambda;                  // the factor loadings/item discriminations (one per Q-matrix entry of 1)
  cholesky_factor_corr[nFactors] thetaCorrL; // Cholesky factor of the latent-variable correlation matrix
}
Note:
theta is an array (needed for the multivariate normal prior)
thr is the same as the unidimensional model
lambda is the vector of all factor loadings to be estimated (needs nLoadings)
thetaCorrL is of type cholesky_factor_corr, a built-in type that identifies this as the lower-triangular Cholesky factor of a correlation matrix
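The Cholesky-factor idea can be sketched numerically: for a correlation matrix with lower-triangular Cholesky factor L, multiplying L by its transpose reconstructs the correlation matrix. The 2-by-2 case and the value r = 0.5 below are illustrative assumptions, not values from the lecture:

```python
import math

# Hypothetical 2x2 case: a correlation matrix with correlation r has
# lower-triangular Cholesky factor L = [[1, 0], [r, sqrt(1 - r^2)]].
r = 0.5
L = [[1.0, 0.0],
     [r, math.sqrt(1 - r**2)]]

# Reconstruct the correlation matrix: (L L^T)_{ij} = sum_k L_{ik} L_{jk}
corr = [[sum(L[i][k] * L[j][k] for k in range(2)) for j in range(2)]
        for i in range(2)]
print(corr)  # approximately [[1.0, 0.5], [0.5, 1.0]]
```

Working with L instead of the correlation matrix itself keeps the matrix positive definite during sampling and makes the multivariate normal prior on theta cheaper to evaluate.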
Stan Transformed Data Block
transformed data {
  int<lower=0> nLoadings = 0; // number of loadings in the model
  for (factor in 1:nFactors) {
    nLoadings = nLoadings + sum(Qmatrix[1:nItems, factor]); // total number of loadings to be estimated
  }
  array[nLoadings, 2] int loadingLocation; // the row/column positions of each loading
  int loadingNum = 1;
  for (item in 1:nItems) {
    for (factor in 1:nFactors) {
      if (Qmatrix[item, factor] == 1) {
        loadingLocation[loadingNum, 1] = item;
        loadingLocation[loadingNum, 2] = factor;
        loadingNum = loadingNum + 1;
      }
    }
  }
}
Note:
The transformed data {} block runs prior to the Markov Chain;
We use it to create variables that will stay constant throughout the chain
Here, we count the number of loadings in the Q-matrix nLoadings
We then process the Q-matrix to tell Stan the row and column position of each loading, stored in loadingLocation and used in the model {} block
This syntax works for any Q-matrix (but only has main effects in the model)
lambdaMatrix puts the estimated loadings and the zeros from the Q-matrix into their correct positions
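The counting logic in the transformed data block can be sketched outside Stan. Here is a Python version of the same two steps, using a hypothetical 3-item, 2-factor Q-matrix (not the lecture's 10-item matrix):

```python
# Hypothetical 3-item, 2-factor Q-matrix (illustration only).
Qmatrix = [
    [0, 1],  # item 1 measures factor 2 only
    [1, 0],  # item 2 measures factor 1 only
    [1, 1],  # item 3 measures both factors
]

# nLoadings: total number of loadings to estimate (sum of all Q-matrix entries)
nLoadings = sum(sum(row) for row in Qmatrix)

# loadingLocation: the (item, factor) position of each loading, 1-indexed as in Stan
loadingLocation = []
for item in range(1, len(Qmatrix) + 1):
    for factor in range(1, len(Qmatrix[0]) + 1):
        if Qmatrix[item - 1][factor - 1] == 1:
            loadingLocation.append([item, factor])

print(nLoadings)        # 4
print(loadingLocation)  # [[1, 2], [2, 1], [3, 1], [3, 2]]
```

Because the positions are computed from the Q-matrix, the same code handles any pattern of items and factors without editing the model itself.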
Stan Data Block
data {
  // data specifications =============================================================
  int<lower=0> nObs;        // number of observations
  int<lower=0> nItems;      // number of items
  int<lower=0> maxCategory; // number of categories for each item

  // input data =============================================================
  array[nItems, nObs] int<lower=1, upper=5> Y; // item responses in an array

  // loading specifications =============================================================
  int<lower=1> nFactors; // number of latent variables in the model
  array[nItems, nFactors] int<lower=0, upper=1> Qmatrix;

  // prior specifications =============================================================
  array[nItems] vector[maxCategory-1] meanThr; // prior mean vector for intercept parameters
  array[nItems] matrix[maxCategory-1, maxCategory-1] covThr; // prior covariance matrix for intercept parameters
  vector[nItems] meanLambda;        // prior mean vector for discrimination parameters
  matrix[nItems, nItems] covLambda; // prior covariance matrix for discrimination parameters
  vector[nFactors] meanTheta;       // prior mean vector for the latent variables
}
Note:
Changes from unidimensional model:
meanTheta: Factor means (hyperparameters) are added (but we will set these to zero)
nFactors: Number of latent variables (needed for Q-matrix)
The part after the comma is a list of who provided responses to the item (input in the data block)
Mirroring this is a change to thetaMatrix[observed[item, 1:nObserved[item]],]
Keeps only the latent variables for the persons who provide responses
Stan Data Block (with Missing Data)
data {
  // data specifications =============================================================
  int<lower=0> nObs;        // number of observations
  int<lower=0> nItems;      // number of items
  int<lower=0> maxCategory; // number of categories for each item
  array[nItems] int nObserved;      // number of non-missing observations for each item
  array[nItems, nObs] int observed; // which observations are non-missing for each item

  // input data =============================================================
  array[nItems, nObs] int<lower=-1, upper=5> Y; // item responses in an array (missing coded as -1)

  // loading specifications =============================================================
  int<lower=1> nFactors; // number of latent variables in the model
  array[nItems, nFactors] int<lower=0, upper=1> Qmatrix;

  // prior specifications =============================================================
  array[nItems] vector[maxCategory-1] meanThr; // prior mean vector for intercept parameters
  array[nItems] matrix[maxCategory-1, maxCategory-1] covThr; // prior covariance matrix for intercept parameters
  vector[nItems] meanLambda;        // prior mean vector for discrimination parameters
  matrix[nItems, nItems] covLambda; // prior covariance matrix for discrimination parameters
  vector[nFactors] meanTheta;       // prior mean vector for the latent variables
}
Data Block Notes
Two new arrays added:
array[nItems] int nObserved : the number of observations with non-missing data for each item
array[nItems, nObs] int observed : a listing of which observations have non-missing data for each item
Here, the size of the array is equal to the size of the data matrix
If there were no missing data at all, the listing of observations with non-missing data would equal this size
Stan uses these arrays to only model data that are not missing
The values of observed serve to select only cases in Y that are not missing
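The selection logic can be sketched in Python. The toy response matrix, the -1 missing code, and the sizes below are illustrative assumptions mirroring the slide's Stan arrays:

```python
# Hypothetical item responses for 2 items and 4 people; -1 codes a missing response.
Y = [
    [1, 3, -1, 5],   # item 1: person 3 is missing
    [2, -1, -1, 4],  # item 2: persons 2 and 3 are missing
]
nObs = 4

# Build nObserved (count of non-missing cases per item) and
# observed (1-indexed positions of those cases, as Stan would use them).
nObserved = []
observed = []
for item_responses in Y:
    idx = [p for p in range(1, nObs + 1) if item_responses[p - 1] != -1]
    nObserved.append(len(idx))
    observed.append(idx)

print(nObserved)  # [3, 2]
print(observed)   # [[1, 2, 4], [1, 4]]

# The model then touches only the non-missing responses for each item,
# e.g. for item 1 (index 0 here):
item = 0
usable = [Y[item][p - 1] for p in observed[item][:nObserved[item]]]
print(usable)  # [1, 3, 5]
```

Indexing by observed is what lets Stan skip missing cases entirely rather than imputing them, and the same indices subset the rows of the latent-variable matrix so persons and responses stay aligned.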