AI-Enhanced Psychometrics with R and Python Examples

Categories: R, Python, LLM, ChatGPT

Author: Jihong Zhang

Published: March 7, 2025

Modified: April 18, 2025

Logo generated by ChatGPT using the prompt — “Generate a logo for AI + Psychometrics”

Overview

In a presentation at the Texas Universities Educational Statistics & Psychometrics (TUESAP) meeting in Dallas, TX, Dr. Hong Jiao gave a fascinating talk about Computational Psychometrics, an interdisciplinary area combining AI and psychometrics.

This blog post reviews the uses of large language models in psychometrics, guided by the following questions:

  1. What is “computational psychometrics”?
  2. What are the applications of AI in educational psychometrics?

Data is the new oil for training large AI models. However, the “oil” generated by humans may run out someday, or grow much more slowly than the speed at which AI consumes it. Moreover, human-created data are harder to control in terms of quality, opinions, format, and style, and may introduce biases or privacy concerns when used for model training (Zhou, 2024).

AI training needs human data but in a controlled way (Zhou, 2024).

Zhou, T. (2024, October 2). AERA Division D webinar series. https://talks.cs.umd.edu/talks/3988

1 Computational Psychometrics

According to von Davier et al. (2021), computational psychometrics provides “a new framework to re-conceptualize assessment theory and practices in the era of digital assessment with the advances in machine learning, natural language processing, and generative AI”. As shown in Table 1, there are many AI-enhanced applications in psychometric research, including machine learning, text data analysis, and generative AI for data generation.

Davier, A. A. von, DiCerbo, K., & Verhagen, J. (2021). Computational Psychometrics: A Framework for Estimating Learners’ Knowledge, Skills and Abilities from Learning and Assessment Systems (A. A. von Davier, R. J. Mislevy, & J. Hao, Eds.; pp. 25–43). Springer International Publishing. https://doi.org/10.1007/978-3-030-74394-9_3
Table 1: AI applications in Educational Psychometrics

| AI Area | Type | Application |
|---|---|---|
| Machine Learning Algorithms | Supervised Learning | Prediction, Classification |
| | Unsupervised Learning | Clustering, Association, Dimensionality Reduction |
| | Reinforcement Learning | |
| Natural Language Processing | Language Models | Text generation, Text summarization |
| | Semantic Analysis | Text theme extraction, Text classification, Text understanding |
| | Text Data Analysis | Text processing, Item parameter prediction, Item quality check |
| Generative AI | AI Agents | Data generation and augmentation: Missing data imputation, Item development and generation, Item review, Automated scoring |
| | Large Language Models | LLMs trained for psychometric tasks: Cheating detection |
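To make the “Supervised Learning: Prediction” and “Item parameter prediction” rows of Table 1 concrete, here is a minimal Python sketch that predicts item difficulty from simple item features. The features and difficulties are simulated placeholders, not real assessment data; actual applications would use item text, response data, and richer models.

```python
# Toy supervised-learning example: predict item difficulty from item features.
# All data below are simulated for illustration only.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n_items = 200

# Hypothetical item features: word count, readability index, number of options
X = np.column_stack([
    rng.integers(10, 120, n_items),   # word count
    rng.uniform(30, 90, n_items),     # readability index
    rng.integers(3, 6, n_items),      # number of answer options
])
# Simulated "true" difficulties loosely related to the features
b = 0.01 * X[:, 0] - 0.02 * X[:, 1] + rng.normal(0, 0.3, n_items)

X_train, X_test, b_train, b_test = train_test_split(X, b, random_state=1)
model = LinearRegression().fit(X_train, b_train)
print("Test R^2:", round(model.score(X_test, b_test), 3))
```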

AI agents within generative AI have drawn particular attention in education because of the popularity and success of AI chatbots such as ChatGPT, Claude, and Gemini. AI agents use AI engineering techniques such as retrieval-augmented generation (RAG) and prompt engineering to enhance the accuracy and reliability of large language model outputs with information fetched from specific, relevant data sources (Merritt, 2025). Some related projects are based at the Maryland Assessment Research Center (MARC).

Merritt, R. (2025). What is retrieval-augmented generation aka RAG? https://blogs.nvidia.com/blog/what-is-retrieval-augmented-generation/
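A minimal sketch of the RAG idea follows. The item-bank passages, the query, and the TF-IDF retriever are illustrative assumptions (production systems typically use neural embeddings, a vector database, and a call to an LLM API), not part of the original talk.

```python
# Minimal retrieval-augmented generation (RAG) sketch:
# retrieve the passage most relevant to a query, then prepend it to the prompt
# that would be sent to an LLM. Documents and query are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "Item 12 measures fraction arithmetic and has a difficulty of 0.8 logits.",
    "Item 27 is a reading comprehension passage about ecosystems.",
    "The scoring rubric awards 2 points for a complete justification.",
]
query = "How should a complete justification be scored?"

# Embed documents and query in the same TF-IDF space
vectorizer = TfidfVectorizer()
doc_matrix = vectorizer.fit_transform(documents)
query_vec = vectorizer.transform([query])

# Retrieve the passage most similar to the query
scores = cosine_similarity(query_vec, doc_matrix).ravel()
best_passage = documents[scores.argmax()]

# Augment the prompt with the retrieved passage before calling an LLM
prompt = f"Context: {best_passage}\n\nQuestion: {query}\nAnswer using only the context."
print(prompt)  # this augmented prompt would then be sent to an LLM endpoint
```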
| Topic | Research Question |
|---|---|
| Avoid misuse of AI | Detect AI-generated essays or homework assignments completed by generative AI |
| Understand AI Behaviors | In automated scoring, compare human and AI rationales to safeguard human ratings with AI; does AI reason similarly to human raters in automated scoring? |
| AI-based Data Augmentation | |
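As a toy illustration of the first research topic above (detecting AI-generated essays), the sketch below trains a simple supervised text classifier. The essays and labels are fabricated placeholders; real detectors require large labeled corpora and much stronger features than TF-IDF.

```python
# Hedged sketch: classify essays as AI-generated vs. human-written.
# The essays and labels are toy data for illustration only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

essays = [
    "In conclusion, the evidence clearly demonstrates a multifaceted impact.",
    "i think the story was good because the dog was brave and funny",
    "Overall, these findings underscore the importance of further research.",
    "my favorite part was when they went to the lake with grandma",
]
labels = [1, 0, 1, 0]  # 1 = AI-generated, 0 = human-written (toy labels)

# TF-IDF features feeding a logistic regression classifier
detector = make_pipeline(TfidfVectorizer(), LogisticRegression())
detector.fit(essays, labels)

new_essay = "In summary, the narrative highlights several salient themes."
print(detector.predict_proba([new_essay]))  # class probabilities for the new essay
```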

2 Large Language Models

Jeff Dean recently gave a presentation about the history of LLMs.

Some key points are relevant to the development of LLMs:

  1. Neural networks and backpropagation are the key building blocks.

  2. 2012: Training a very large neural network using 16,000 CPU cores.

  3. Distributing training data and models across multiple computers.

  4. Word2Vec represents the relationships between words as vectors in a high-dimensional space (see the sketch after this list).

  5. 2015: The development of TPU (Tensor Processing Unit) chips for training large models.

  6. Deep learning frameworks: TensorFlow, PyTorch, JAX.

  7. Attention: save all internal states and attend to the relevant ones.

  8. 2022: “Thinking longer” => letting the model show its work (e.g., chain-of-thought reasoning).

  9. Distillation: a “teacher” model provides a full probability distribution over outputs, which is a richer training signal for a smaller “student” model than hard labels.
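To make point 4 concrete, here is a small sketch of word-embedding arithmetic. It assumes the gensim package and an internet connection to download pretrained GloVe vectors (a close cousin of Word2Vec embeddings); the model name and dimensionality are illustrative choices.

```python
# Word embeddings place words in a high-dimensional space where semantic
# relationships become vector arithmetic. Requires: pip install gensim
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-50")  # 50-dimensional pretrained vectors

# The classic analogy: king - man + woman ≈ queen
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=3))

# Semantic similarity is the cosine of the angle between word vectors
print(vectors.similarity("test", "exam"))
```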

2.1 Are large models better than small models?

Many traditional social science models are small models, so a natural question is whether large models are better than small models. Here, let’s define a large model as one with more than 10^5 estimated parameters.

Large models generally assume that a large amount of data is available for training them.

The answer is both “yes” and “no”; it depends on the task and the data.
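A back-of-the-envelope comparison against the 10^5 threshold may help; the model sizes below are hypothetical examples, not taken from the post.

```python
# Rough parameter counts for a "small" vs. a "large" model (illustrative only).

# A small psychometric model: 2-parameter logistic IRT with 60 items
# (one discrimination and one difficulty per item) plus 1,000 person abilities
irt_params = 60 * 2 + 1000
print(f"2PL IRT model: {irt_params:,} parameters")  # about 1,120

# A single Transformer feed-forward block with hidden size 768 and inner size 3072
# (two weight matrices plus their biases)
ffn_params = 768 * 3072 + 3072 + 3072 * 768 + 768
print(f"One Transformer FFN block: {ffn_params:,} parameters")  # about 4.7 million
```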


Citation

BibTeX citation:
@online{zhang2025,
  author = {Zhang, Jihong},
  title = {AI-Enhanced {Psychometrics} with {R} and {Python} {Examples}},
  date = {2025-03-07},
  url = {https://www.jihongzhang.org/posts/2025/2025-04-03-Large-Language-Model-Psychometrics/},
  langid = {en}
}
For attribution, please cite this work as:
Zhang, J. (2025, March 7). AI-Enhanced Psychometrics with R and Python Examples. https://www.jihongzhang.org/posts/2025/2025-04-03-Large-Language-Model-Psychometrics/