Generalized Measurement Models: Modeling Observed Data
Jihong Zhang
Educational Statistics and Research Methods
Previous Class
We used two simulated data sets to introduce how Stan can be used for factor analysis
A simple structure
A structure with cross-loading
What remains is how to interpret the parameters in factor analysis
We also need to refresh our memory of the R and Stan coding we have learned so far
Today’s Lecture Objectives
Quickly go through our R code file so far
Show different modeling specifications for different types of item response data
Show how parameterization differs for standardized latent variables vs. marker item scale identification
Example Data: Conspiracy Theories
Today’s example is from a bootstrap resample of 177 undergraduate students at a large state university in the Midwest.
The survey was a measure of 10 questions about their beliefs in various conspiracy theories that were being passed around the internet in the early 2010s
All item responses were on a 5-point Likert scale with:
Strongly Disagree
Disagree
Neither Agree nor Disagree
Agree
Strongly Agree
The purpose of this survey was to study individual beliefs regarding conspiracies.
Our purpose in using this instrument is to provide a context that we all may find relevant as many of these conspiracies are still prevalent.
Conspiracy Theory Q1-Q5
The U.S. invasion of Iraq was not part of a campaign to fight terrorism, but was driven by oil companies and Jews in the U.S. and Israel.
Certain U.S. government officials planned the attacks of September 11, 2001 because they wanted the United States to go to war in the Middle East.
President Barack Obama was not really born in the United States and does not have an authentic Hawaiian birth certificate.
The current financial crisis was secretly orchestrated by a small group of Wall Street bankers to extend the power of the Federal Reserve and further their control of the world’s economy.
Vapor trails left by aircraft are actually chemical agents deliberately sprayed in a clandestine program directed by government officials.
Conspiracy Theory Q6-Q10
Billionaire George Soros is behind a hidden plot to destabilize the American government, take control of the media, and put the world under his control.
The U.S. government is mandating the switch to compact fluorescent light bulbs because such lights make people more obedient and easier to control.
Government officials are covertly building a 12-lane “NAFTA superhighway” that runs from Mexico to Canada through America’s heartland.
Government officials purposely developed and spread drugs like crack-cocaine and diseases like AIDS in order to destroy the African American community.
God sent Hurricane Katrina to punish America for its sins.
Response Distribution
All items seem to be positively skewed
Conspiracy Theories: Assumed Latent Variable
For today’s lecture, we will assume each of the 10 items measures one single latent variable
\(\theta\): tendency to believe in conspiracy theories
Higher values of \(\theta\) suggest a stronger tendency to believe in conspiracy theories
Let’s denote this latent variable as \(\theta_p\) for individual p
p is the index for person with \(p = \{1, \cdots, P\}\)
A psychometric model posits that one or more hypothesized latent variable(s) is the common cause that can predict a person’s response to observed items:
Our hypothesized latent variable: Tendency to Believe in Conspiracies (\(\theta_p\))
As we have only one latent variable, the model structure is called Unidimensional
All 10 items are considered as outcomes of the latent variable in the model
In today’s class, we assume all item responses follow a normal distribution:
This is the assumption underlying confirmatory factor analysis (CFA) models
This assumption is tenuous at best
Normal Distribution: Linear Regression
A typical linear regression looks like
\[
Y_p =\beta_0 +\beta_1 X_p + e_p
\]
with \(e_p\sim N(0, \sigma_e)\)
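If we replace \(X_p\) with the latent variable \(\theta_p\), \(\beta_0\) with the item intercept \(\mu_i\), and \(\beta_1\) with the factor loading \(\lambda_i\),
We get the item response function (IRF) for each item:
\[
Y_{pi} = \mu_i + \lambda_i \theta_p + e_{pi}
\]
with \(e_{pi}\sim N(0, \psi_i)\), where \(\psi_i\) is the unique standard deviation of item \(i\)
\(\mu_i\): Item intercept
Interpretation: the expected score on item \(i\) when \(\theta_p=0\)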
A higher item intercept means that people with an average level of conspiracy belief are more likely to agree with the item
So it is also called item easiness in item response theory (IRT)
\(\lambda_i\): Factor loading or Item discrimination
The change in the expected score of an item for a one-unit increase in belief in conspiracy
\(\psi_i^2\): Unique variance
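To make the interpretation of these parameters concrete, here is a minimal Python sketch (the lecture itself uses R and Stan). It assumes the item response function has the form \(Y_{pi} = \mu_i + \lambda_i\theta_p + e_{pi}\) with a normal residual whose standard deviation is \(\psi_i\). The function names and the parameter values are hypothetical, chosen only for illustration; they are not estimates from the conspiracy data.

```python
import random


def expected_score(mu_i, lambda_i, theta_p):
    """Conditional mean of item i for a person with latent score theta_p."""
    return mu_i + lambda_i * theta_p


def simulate_response(mu_i, lambda_i, psi_i, theta_p, rng):
    """Draw one response: conditional mean plus a normal residual with SD psi_i."""
    return expected_score(mu_i, lambda_i, theta_p) + rng.gauss(0.0, psi_i)


# Hypothetical item parameters (illustration only, not estimates)
mu_i, lambda_i, psi_i = 2.0, 0.5, 0.75

rng = random.Random(1)  # fixed seed so the draws are reproducible
for theta_p in (-1.0, 0.0, 1.0):
    mean = expected_score(mu_i, lambda_i, theta_p)
    draw = simulate_response(mu_i, lambda_i, psi_i, theta_p, rng)
    print(f"theta={theta_p:+.1f}  expected={mean:.2f}  simulated={draw:.2f}")
```

At \(\theta_p = 0\) the expected score equals the intercept, and each one-unit increase in \(\theta_p\) shifts the expected score by the loading.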
Measurement Model Identification
When we specify a measurement model, we need to choose one scale identification method for the latent variable
Assume the latent variable follows a standard normal distribution
Or, fix the factor loading of a marker item to 1
In this study, we assume \(\theta_p \sim N(0,1)\) which allows us to estimate all item parameters of the model
This is what we call a standardization identification method
Factor scores are like Z-scores
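The need for a scale constraint can be seen from a short worked equation. As a sketch, assume the latent variable has mean 0 and standard deviation \(\sigma_\theta\) (a symbol introduced here for illustration only). For any constant \(c > 0\):

\[
\lambda_i \theta_p = (c\,\lambda_i)\left(\frac{\theta_p}{c}\right)
\]

so the data cannot distinguish \((\lambda_i, \sigma_\theta)\) from \((c\,\lambda_i, \sigma_\theta / c)\). Each identification method removes this indeterminacy by fixing one quantity:

\[
\text{Standardized latent variable: } \sigma_\theta = 1
\qquad
\text{Marker item: } \lambda_1 = 1
\]

Under these assumptions, a loading from the marker item solution converts to the standardized solution as \(\lambda_i^{\text{std}} = \lambda_i^{\text{marker}}\,\sigma_\theta\).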
Implementing Normal Outcomes in Stan
Recall that we can use matrix operations to make Stan estimate psychometric models with normal outcomes:
The model (predictor) matrix cannot be used
This is because the latent variable is sampled, so the model matrix cannot be formed as a constant
The data will be imported as a matrix
More than one outcome means more than one column vector of data
The parameters will be specified as vectors of each type
Each item will have its own set of parameters
Implications for the use of prior distributions
Stan’s data Block
data {
  int<lower=0> nObs;                 // number of observations
  int<lower=0> nItems;               // number of items
  matrix[nObs, nItems] Y;            // item responses in a matrix
  vector[nItems] meanMu;             // prior mean vector for item intercepts
  matrix[nItems, nItems] covMu;      // prior covariance matrix for item intercepts
  vector[nItems] meanLambda;         // prior mean vector for factor loadings
  matrix[nItems, nItems] covLambda;  // prior covariance matrix for factor loadings
  vector[nItems] psiRate;            // prior rate parameter for unique standard deviations
}
nObs is 177, declared as an integer with a lower bound of 0
nItems is 10, declared as an integer with a lower bound of 0
meanMu and covMu are the prior mean vector and covariance matrix for \(\mu_i\)
meanLambda and covLambda are the prior mean vector and covariance matrix for \(\lambda_i\)
psiRate is the prior rate parameter for \(\psi_i\)
Stan’s parameter Block
parameters {
  vector[nObs] theta;            // the latent variables (one for each person)
  vector[nItems] mu;             // the item intercepts (one for each item)
  vector[nItems] lambda;         // the factor loadings/item discriminations (one for each item)
  vector<lower=0>[nItems] psi;   // the unique standard deviations (one for each item)
}
Here, the parameterization of \(\lambda\) (factor loadings / item discrimination) can lead to problems in estimation
The issue: \(\lambda_i \theta_p = (-\lambda_i) (-\theta_p)\)
Depending on the random starting values of each of these parameters (per chain), a given chain may converge to a different region
To demonstrate, we will start with a different random number seed
Currently using 09102022: works fine
Change to 25102022: big problem
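The sign indeterminacy can be checked numerically. The following Python sketch (standard library only; the lecture's own code is R and Stan) simulates a small data set from hypothetical parameter values and evaluates the normal log-likelihood twice: once at the original values and once with both \(\lambda\) and \(\theta\) multiplied by \(-1\). All names and values are assumptions made for illustration.

```python
import math
import random


def normal_logpdf(y, mean, sd):
    """Log density of a normal distribution parameterized by mean and SD."""
    return -0.5 * math.log(2.0 * math.pi) - math.log(sd) - 0.5 * ((y - mean) / sd) ** 2


def log_likelihood(Y, theta, mu, lam, psi):
    """Sum of item-level normal log densities over all persons and items."""
    total = 0.0
    for p, row in enumerate(Y):
        for i, y in enumerate(row):
            total += normal_logpdf(y, mu[i] + lam[i] * theta[p], psi[i])
    return total


rng = random.Random(2022)

# Hypothetical parameters for 3 items and 50 persons (illustration only)
mu = [2.0, 1.8, 2.2]
lam = [0.7, 0.5, 0.9]
psi = [0.8, 0.9, 0.7]
theta = [rng.gauss(0.0, 1.0) for _ in range(50)]
Y = [[mu[i] + lam[i] * t + rng.gauss(0.0, psi[i]) for i in range(3)] for t in theta]

ll_original = log_likelihood(Y, theta, mu, lam, psi)
ll_flipped = log_likelihood(Y, [-t for t in theta], mu, [-l for l in lam], psi)
print(ll_original, ll_flipped)  # the two values agree
```

Because the likelihood is identical at both points and the \(N(0,1)\) prior on \(\theta\) is symmetric, a prior on \(\lambda\) that is also symmetric about zero would give the posterior two mirror-image modes, which is why different chains can settle in different regions.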
Stan’s model Block
model {
  lambda ~ multi_normal(meanLambda, covLambda); // Prior for item discrimination/factor loadings
  mu ~ multi_normal(meanMu, covMu);             // Prior for item intercepts
  psi ~ exponential(psiRate);                   // Prior for unique standard deviations
  theta ~ normal(0, 1);                         // Prior for latent variable (with mean/sd specified)
  for (item in 1:nItems){
    Y[,item] ~ normal(mu[item] + lambda[item]*theta, psi[item]);
  }
}
The loop here builds the model through the item response function (IRF) for each item:
Assumption of conditional independence enables this
Non-independence would need multivariate normal model
The item mean is set by the conditional mean of the model
The item SD is set by the unique standard deviation parameter
The loop puts each item’s parameters into the equation
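The role of conditional independence can be illustrated with a small Python sketch (standard library only, not part of the lecture's R and Stan code). For one person, it compares the item-by-item sum of univariate normal log densities with the log density of a multivariate normal whose covariance matrix is diagonal. The function names and all numeric values are hypothetical.

```python
import math


def normal_logpdf(y, mean, sd):
    """Log density of a univariate normal parameterized by mean and SD."""
    return -0.5 * math.log(2.0 * math.pi) - math.log(sd) - 0.5 * ((y - mean) / sd) ** 2


def item_by_item_loglik(y_row, theta_p, mu, lam, psi):
    """Mirror of the item loop: one univariate normal term per item."""
    return sum(
        normal_logpdf(y, mu[i] + lam[i] * theta_p, psi[i])
        for i, y in enumerate(y_row)
    )


def diagonal_mvn_logpdf(y_row, means, sds):
    """Multivariate normal log density when the covariance is diag(sds squared)."""
    k = len(y_row)
    log_det = sum(2.0 * math.log(s) for s in sds)
    quad = sum(((y - m) / s) ** 2 for y, m, s in zip(y_row, means, sds))
    return -0.5 * (k * math.log(2.0 * math.pi) + log_det + quad)


# Hypothetical values for one person and three items (illustration only)
mu = [2.0, 1.8, 2.2]
lam = [0.7, 0.5, 0.9]
psi = [0.8, 0.9, 0.7]
theta_p = 0.4
y_row = [2.5, 1.6, 3.0]

means = [mu[i] + lam[i] * theta_p for i in range(3)]
loop_value = item_by_item_loglik(y_row, theta_p, mu, lam, psi)
joint_value = diagonal_mvn_logpdf(y_row, means, psi)
print(loop_value, joint_value)  # equal up to floating-point rounding
```

The two values match because, given \(\theta_p\), the residuals are assumed uncorrelated across items. Correlated residuals would require a full covariance matrix, and the simple per-item loop would no longer give the joint density.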
Choosing Prior Distributions for Parameters
There is no uniform agreement about the choices of prior distributions for item parameters
We will use uninformative priors on each to begin
After first model analysis, we will discuss these choices and why they were made
Note: In Stan, the second argument to the “normal” function is the standard deviation (i.e., the scale), not the variance (as in Bayesian Data Analysis) and not the inverse-variance (i.e., precision) (as in BUGS).
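Because the three conventions are easy to mix up, the conversions can be written out explicitly. The Python sketch below uses `statistics.NormalDist`, which, like Stan's `normal`, takes the standard deviation as its second argument. The helper function names and the numeric values are hypothetical.

```python
import math
from statistics import NormalDist


def sd_from_variance(variance):
    """Convert a variance to the standard deviation that Stan expects."""
    return math.sqrt(variance)


def sd_from_precision(precision):
    """Convert a BUGS-style precision (1 / variance) to a standard deviation."""
    return 1.0 / math.sqrt(precision)


variance = 4.0                # hypothetical value, illustration only
precision = 1.0 / variance    # the same spread expressed as a precision

sd_a = sd_from_variance(variance)
sd_b = sd_from_precision(precision)

# NormalDist(mu, sigma): the second argument is the SD, as in Stan's normal()
dist = NormalDist(0.0, sd_a)
print(sd_a, sd_b, dist.variance)
```

A prior intended to have variance 4 would therefore be written with a scale of 2 in Stan, whereas the same prior in BUGS would use a precision of 0.25.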
Running the model in Stan
The total number of parameters is 207.
177 person parameters (\(\theta_1\) to \(\theta_{177}\))
10 estimated parameters each for item intercepts (\(\mu_{1-10}\)), factor loadings (\(\lambda_{1-10}\)), and unique standard deviations (\(\psi_{1-10}\)).
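The count can be verified directly. Writing \(P\) for the number of persons and \(I\) for the number of items (with \(I\) introduced here only as shorthand):

\[
P + 3I = 177 + 3 \times 10 = 207
\]

where the factor of 3 covers the intercept, loading, and unique standard deviation of each item.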