🚀 Overview

This repository contains the dataset and analysis from Almasi & Kristensen-McLachlan (2025):

| Item | Location | Documentation |
|---|---|---|
| 📦 Text Dataset (`v3.0`) | `data/v3.0_dataset.csv` | `data/README.md` |
| 📦 Metrics Dataset (`v3.0`) | `metrics/*.csv` | `metrics/README.md` |
| 🧪 Analysis | `src/` | `src/README.md` |
| 📊 Plots & Results | `plots/` & `results/` | |
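
For a quick look at the files listed in the table, the following is a minimal sketch using only the Python standard library. The helper name `preview_csv` is hypothetical and not part of this repository. No column names are assumed; they are read from each file's header. The paths are the ones given in the table and are resolved relative to the repository root.

```python
import csv
from pathlib import Path


def preview_csv(path, n_rows=3):
    """Return (header, first n_rows data rows) of a CSV file.

    Hypothetical helper for illustration; not part of the repository.
    """
    with Path(path).open(newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, [])  # empty list if the file has no rows
        rows = [row for _, row in zip(range(n_rows), reader)]
    return header, rows


if __name__ == "__main__":
    # Paths taken from the table; run this from the repository root.
    paths = [Path("data/v3.0_dataset.csv"), *sorted(Path("metrics").glob("*.csv"))]
    for p in paths:
        if p.exists():
            header, rows = preview_csv(p)
            print(p, header, f"({len(rows)} rows previewed)")
```

The column descriptions themselves are documented in `data/README.md` and `metrics/README.md`.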

Teacher-student dialogue simulations were performed in a separate repository:

| Item | Location | Documentation |
|---|---|---|
| 🛠️ Generation of Dialogues | `Interact-LLM repo (src/scripts/alignment-drift)` | `README.md` |

Note: The prefix v3.0 for the data refers to the prompt version used to simulate the dialogues. See the prompts in the Interact-LLM repo.

🛠️ Technical Requirements

The code was run with Python 3.12.3 on both macOS (15.3.1) and Ubuntu (24.04). The project also requires:

| Tool | Installation |
|---|---|
| make | Installed via Homebrew |
| uv | Installed through this project's `makefile` (see Usage) |
| R 4.4.3 + R Markdown | Installed separately: R via CRAN, and Posit's RStudio (or an IDE of your liking) for running R Markdown. |
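
Before running anything, it can help to check which of these tools are already on your `PATH`. The sketch below is an illustration only: the helper name `missing_tools` is hypothetical, and the executable names `make`, `uv`, and `Rscript` are assumed to be the command-line entry points for the tools in the table.

```python
import shutil


def missing_tools(tools=("make", "uv", "Rscript")):
    """Return the names in `tools` that are not found on PATH.

    Hypothetical helper; the default executable names are assumptions.
    """
    return [t for t in tools if shutil.which(t) is None]


if __name__ == "__main__":
    missing = missing_tools()
    print("Missing:", ", ".join(missing) if missing else "none")
```

A missing `uv` is expected on a fresh machine, since the `makefile` can install it (see Usage).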

⚙️ Usage

You can run the code using the makefile by entering the following command in the terminal:

make run-project

This command installs uv on macOS/Linux, sets up a virtual environment with the required Python dependencies, and finally runs the code.

If you prefer to use your own installation of uv (or already have it installed), you can skip the installation step and run the code directly:

make run-code

Note: This does not execute stats.rmd. It must be run separately (requires R and R Markdown, see Technical Requirements).
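
If you would rather not open an IDE for the R step, one option is to render the R Markdown file from a script. The sketch below assumes that `Rscript` is on your `PATH` and that the `rmarkdown` R package is installed. The helper names `render_command` and `render` are hypothetical, and the location of `stats.rmd` inside the repository is not stated here, so the path is passed in as an argument.

```python
import subprocess


def render_command(rmd_path):
    """Build an Rscript command that renders an R Markdown file.

    Hypothetical helper; assumes Rscript and the rmarkdown package exist.
    """
    # The path is embedded in a single-quoted R string, so reject quotes.
    if "'" in rmd_path:
        raise ValueError("path must not contain single quotes")
    return ["Rscript", "-e", f"rmarkdown::render('{rmd_path}')"]


def render(rmd_path):
    """Run the render command; raises CalledProcessError if R fails."""
    subprocess.run(render_command(rmd_path), check=True)
```

Calling `render(...)` with the path to `stats.rmd` in your checkout would then knit the document using R Markdown's default output settings.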

📝 Citation

If you use our work, please cite:

@article{almasi2025alignmentdriftcefrpromptedllms,
  title={Alignment Drift in CEFR-prompted LLMs for Interactive Spanish Tutoring}, 
  author={Mina Almasi and Ross Deans Kristensen-McLachlan},
  journal={arXiv preprint arXiv:2505.08351},
  year={2025},
  url={https://arxiv.org/abs/2505.08351},
  note={cs.CL}
}

Note: This paper has been accepted to the ACL workshop BEA2025 (20th Workshop on Innovative Use of NLP for Building Educational Applications). The final version, appearing in the ACL Anthology, is forthcoming.

✨ Acknowledgements

This work was made possible thanks to the following open-source resources:

textstat for Spanish Readability Metrics
textdescriptives for Text Length & Mean Dependency Distance
minicons & EuroBERT for LLM-based Message Surprisal
