AI-Enhanced Psychometrics with R and Python Examples

Categories: R, Python, LLM, ChatGPT

Author: Jihong Zhang

Published: March 7, 2025

Modified: April 18, 2025

Logo generated by ChatGPT using the prompt — “Generate a logo for AI + Psychometrics”

Overview

In a presentation at the Texas Universities Educational Statistics & Psychometrics (TUESAP) meeting in Dallas, TX, Dr. Hong Jiao gave a fascinating talk about Computational Psychometrics, an interdisciplinary area combining AI and psychometrics.

This blog post reviews the uses of large language models in psychometrics, guided by the following questions:

  1. What is “computational psychometrics”?
  2. What are the applications of AI in educational psychometrics?

Data is the new oil for training large AI models. However, the “oil” generated by humans may run out someday, or grow much more slowly than the speed at which AI consumes it. Moreover, human-created data are harder to control in terms of quality, opinions, format, and style, and may introduce biases or privacy concerns when used for model training (Zhou, 2024).

AI training needs human data but in a controlled way (Zhou, 2024).

Zhou, T. (2024, October 2). AERA Division D webinar series. https://talks.cs.umd.edu/talks/3988

1 Computational Psychometrics

According to von Davier et al. (2021), computational psychometrics provides “a new framework to re-conceptualize assessment theory and practices in the era of digital assessment with the advances in machine learning, natural language processing, and generative AI”. As shown in Table 1, there are many AI-enhanced applications in psychometric research, including machine learning, text data analysis, and generative AI for data generation.

Davier, A. A. von, DiCerbo, K., & Verhagen, J. (2021). Computational Psychometrics: A Framework for Estimating Learners’ Knowledge, Skills and Abilities from Learning and Assessment Systems (A. A. von Davier, R. J. Mislevy, & J. Hao, Eds.; pp. 25–43). Springer International Publishing. https://doi.org/10.1007/978-3-030-74394-9_3
Table 1: AI applications in Educational Psychometrics

| AI Area | Type | Application |
|---|---|---|
| Machine Learning Algorithms | Supervised Learning | Prediction, Classification |
| | Unsupervised Learning | Clustering, Association, Dimensionality Reduction |
| | Reinforcement Learning | |
| Natural Language Processing | Language Models | Text generation, Text summarization |
| | Semantic Analysis | Text theme extraction, Text classification, Text understanding |
| | Text Data Analysis | Text processing, Item parameter prediction, Item quality check |
| Generative AI | AI Agents | Data generation and augmentation: Missing data imputation, Item development and generation, Item review, Automated scoring |
| | Large Language Models | LLMs trained for psychometric tasks: Cheating detection |
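To make the “Supervised Learning: Prediction” and “Item parameter prediction” rows of Table 1 concrete, here is a minimal Python sketch that predicts item difficulty from simple item features. The features and difficulties are simulated placeholders, not real assessment data; actual applications would use item text, response data, and richer models.

```python
# Toy supervised-learning example: predict item difficulty from item features.
# All data below are simulated for illustration only.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n_items = 200

# Hypothetical item features: word count, readability index, number of options
X = np.column_stack([
    rng.integers(10, 120, n_items),   # word count
    rng.uniform(30, 90, n_items),     # readability index
    rng.integers(3, 6, n_items),      # number of answer options
])
# Simulated "true" difficulties loosely related to the features
b = 0.01 * X[:, 0] - 0.02 * X[:, 1] + rng.normal(0, 0.3, n_items)

X_train, X_test, b_train, b_test = train_test_split(X, b, random_state=1)
model = LinearRegression().fit(X_train, b_train)
print("Test R^2:", round(model.score(X_test, b_test), 3))
```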

AI agents within generative AI have drawn particular attention in education because of the popularity and success of AI chatbots such as ChatGPT, Claude, and Gemini. AI agents use AI engineering techniques such as retrieval-augmented generation (RAG) and prompt engineering to enhance the accuracy and reliability of large language model outputs with information fetched from specific, relevant data sources (Merritt, 2025). Some related projects are based at the Maryland Assessment Research Center (MARC).

Merritt, R. (2025). What is retrieval-augmented generation aka RAG? https://blogs.nvidia.com/blog/what-is-retrieval-augmented-generation/
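A minimal sketch of the RAG idea follows. The item-bank passages, the query, and the TF-IDF retriever are illustrative assumptions (production systems typically use neural embeddings, a vector database, and a call to an LLM API), not part of the original talk.

```python
# Minimal retrieval-augmented generation (RAG) sketch:
# retrieve the passage most relevant to a query, then prepend it to the prompt
# that would be sent to an LLM. Documents and query are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "Item 12 measures fraction arithmetic and has a difficulty of 0.8 logits.",
    "Item 27 is a reading comprehension passage about ecosystems.",
    "The scoring rubric awards 2 points for a complete justification.",
]
query = "How should a complete justification be scored?"

# Embed documents and query in the same TF-IDF space
vectorizer = TfidfVectorizer()
doc_matrix = vectorizer.fit_transform(documents)
query_vec = vectorizer.transform([query])

# Retrieve the passage most similar to the query
scores = cosine_similarity(query_vec, doc_matrix).ravel()
best_passage = documents[scores.argmax()]

# Augment the prompt with the retrieved passage before calling an LLM
prompt = f"Context: {best_passage}\n\nQuestion: {query}\nAnswer using only the context."
print(prompt)  # this augmented prompt would then be sent to an LLM endpoint
```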
| Topic | Research Question |
|---|---|
| Avoid misuse of AI | Detect AI-generated essays or homework assignments completed by generative AI |
| Understand AI Behaviors | In automated scoring, compare human and AI rationales to safeguard human ratings with AI; does AI reason similarly to human raters in automated scoring? |
| AI-based Data Augmentation | |
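As a toy illustration of the first research topic above (detecting AI-generated essays), the sketch below trains a simple supervised text classifier. The essays and labels are fabricated placeholders; real detectors require large labeled corpora and much stronger features than TF-IDF.

```python
# Hedged sketch: classify essays as AI-generated vs. human-written.
# The essays and labels are toy data for illustration only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

essays = [
    "In conclusion, the evidence clearly demonstrates a multifaceted impact.",
    "i think the story was good because the dog was brave and funny",
    "Overall, these findings underscore the importance of further research.",
    "my favorite part was when they went to the lake with grandma",
]
labels = [1, 0, 1, 0]  # 1 = AI-generated, 0 = human-written (toy labels)

# TF-IDF features feeding a logistic regression classifier
detector = make_pipeline(TfidfVectorizer(), LogisticRegression())
detector.fit(essays, labels)

new_essay = "In summary, the narrative highlights several salient themes."
print(detector.predict_proba([new_essay]))  # class probabilities for the new essay
```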

2 Large Language Models

Jeff Dean recently gave a presentation about the history of LLMs.

Some key points are relevant to the development of LLMs:

  1. Neural networks and backpropagation are the key building blocks.

  2. 2012: Training a very large neural network using 16,000 CPU cores.

  3. Distributing training data and models across multiple computers.

  4. Word2Vec represents the relationships between words as vectors in a high-dimensional space (see the sketch after this list).

  5. 2015: The development of TPU (Tensor Processing Unit) chips for training large models.

  6. Deep learning frameworks: TensorFlow, PyTorch, JAX.

  7. Attention: save all internal states and attend to the relevant ones.

  8. 2022: “Thinking longer” => letting the model show its work (e.g., chain-of-thought reasoning).

  9. Distillation: a “teacher” model provides a full probability distribution over outputs, which is a richer training signal for a smaller “student” model than hard labels.
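To make point 4 concrete, here is a small sketch of word-embedding arithmetic. It assumes the gensim package and an internet connection to download pretrained GloVe vectors (a close cousin of Word2Vec embeddings); the model name and dimensionality are illustrative choices.

```python
# Word embeddings place words in a high-dimensional space where semantic
# relationships become vector arithmetic. Requires: pip install gensim
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-50")  # 50-dimensional pretrained vectors

# The classic analogy: king - man + woman ≈ queen
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=3))

# Semantic similarity is the cosine of the angle between word vectors
print(vectors.similarity("test", "exam"))
```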

2.1 Are large models better than small models?

Many traditional social science models are small models, so a natural question is whether large models are better than small models. Here, let’s define a large model as one with more than 10^5 estimated parameters.

Large models generally assume that a large amount of data is available for training them.

The answer is both “yes” and “no”; it depends on the task and the data.
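A back-of-the-envelope comparison against the 10^5 threshold may help; the model sizes below are hypothetical examples, not taken from the post.

```python
# Rough parameter counts for a "small" vs. a "large" model (illustrative only).

# A small psychometric model: 2-parameter logistic IRT with 60 items
# (one discrimination and one difficulty per item) plus 1,000 person abilities
irt_params = 60 * 2 + 1000
print(f"2PL IRT model: {irt_params:,} parameters")  # about 1,120

# A single Transformer feed-forward block with hidden size 768 and inner size 3072
# (two weight matrices plus their biases)
ffn_params = 768 * 3072 + 3072 + 3072 * 768 + 768
print(f"One Transformer FFN block: {ffn_params:,} parameters")  # about 4.7 million
```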


Citation

BibTeX citation:
@online{zhang2025,
  author = {Zhang, Jihong},
  title = {AI-Enhanced {Psychometrics} with {R} and {Python} {Examples}},
  date = {2025-03-07},
  url = {https://www.jihongzhang.org/posts/2025/2025-04-03-Large-Language-Model-Psychometrics/},
  langid = {en}
}
For attribution, please cite this work as:
Zhang, J. (2025, March 7). AI-Enhanced Psychometrics with R and Python Examples. https://www.jihongzhang.org/posts/2025/2025-04-03-Large-Language-Model-Psychometrics/