This repository contains the dataset and analysis from Almasi & Kristensen-McLachlan (2025):
| Item | Location | Documentation |
|---|---|---|
| 📦 Text Dataset (v3.0) | data/v3.0_dataset.csv | data/README.md |
| 📦 Metrics Dataset (v3.0) | metrics/*.csv | metrics/README.md |
| 🧪 Analysis | src/ | src/README.md |
| 📊 Plots & Results | plots/ & results/ | |

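
The CSV files listed above can be inspected without extra dependencies. The following is a minimal sketch, not part of the repository's code: the helper names `preview_csv` and `list_metric_files` are hypothetical, UTF-8 encoding is an assumption, and no column names are assumed (see data/README.md and metrics/README.md for the actual schema). It is meant to be run from the repository root.

```python
import csv
from itertools import islice
from pathlib import Path


def preview_csv(path, n_rows=3):
    """Return (header, first n_rows data rows) of a CSV file.

    Assumes UTF-8 encoding and a header on the first line.
    """
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, [])
        rows = list(islice(reader, n_rows))
    return header, rows


def list_metric_files(metrics_dir="metrics"):
    """Return the CSV files matching metrics/*.csv, sorted by name."""
    return sorted(Path(metrics_dir).glob("*.csv"))


if __name__ == "__main__":
    # Paths taken from the table above.
    text_path = Path("data/v3.0_dataset.csv")
    if text_path.exists():
        header, rows = preview_csv(text_path)
        print("columns:", header)
        print("rows previewed:", len(rows))
    for metric_file in list_metric_files():
        header, _ = preview_csv(metric_file, n_rows=0)
        print(metric_file.name, "->", header)
```
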
Teacher-student dialogue simulations were performed in a separate repository:
| Item | Location | Documentation |
|---|---|---|
| 🛠️ Generation of Dialogues | Interact-LLM repo (src/scripts/alignment-drift) | README.md |

Note: The prefix `v3.0` for the data refers to the prompt version used to simulate the dialogues. See the prompts in the Interact-LLM repo.
The code was run with Python 3.12.3 on both macOS (15.3.1) and Ubuntu (24.04). The project also requires:
| Tool | Installation |
|---|---|
| make | Installed via Homebrew |
| uv | Installed through this project's makefile (see Usage) |
| R 4.4.3 + R Markdown | Installed separately: R from CRAN, plus Posit's RStudio (or an IDE of your choice) for running R Markdown. |

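
Before running anything, it can help to confirm that the tools in the table are on your `PATH`. The following is a minimal sketch, not part of this repository: `missing_tools` is a hypothetical helper, and checking for `Rscript` as evidence of an R installation is an assumption. Note that `uv` only needs to be present in advance if you skip the makefile's own uv installation.

```python
import shutil

# Executables corresponding to the tools in the table above.
# "Rscript" is assumed to be the command-line entry point of the R install.
REQUIRED_TOOLS = ("make", "uv", "Rscript")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required tools were found.")
```
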
You can run the code using the makefile by entering the following command in the terminal:
```bash
make run-project
```

This command installs uv on macOS/Linux, sets up a virtual environment with the required Python dependencies, and finally runs the code.

If you prefer to run your own installation of uv (or already have it installed), you can run only the code directly:
```bash
make run-code
```

Note: This does not execute `stats.rmd`. It must be run separately (requires R and R Markdown, see Technical Requirements).

If you use our work, please cite:
```bibtex
@article{almasi2025alignmentdriftcefrpromptedllms,
  title={Alignment Drift in CEFR-prompted LLMs for Interactive Spanish Tutoring},
  author={Mina Almasi and Ross Deans Kristensen-McLachlan},
  journal={arXiv preprint arXiv:2505.08351},
  year={2025},
  url={https://arxiv.org/abs/2505.08351},
  note={cs.CL}
}
```

Note: This paper has been accepted to the ACL workshop BEA2025 (20th Workshop on Innovative Use of NLP for Building Educational Applications). The final version, appearing in the ACL Anthology, is forthcoming.
This work was made possible thanks to the following open-source resources:
- textstat for Spanish Readability Metrics
- textdescriptives for Text Length & Mean Dependency Distance
- minicons & EuroBERT for LLM-based Message Surprisal
See also metrics/README.md.
