Boundary-conditioned character task (length T = 20): position 0 holds the rule digit k; one [B] token appears uniformly in [2, T-3]; the remaining tokens are random letters/digits. The output copies tokens through [B], then transforms the rest: digits add k mod 10, letters flip case. Loss/metrics cover all non-[PAD] tokens.
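For concreteness, here is a minimal sketch of the generation and transform rules above (illustrative only; details like the literal "[B]" string and the random character pool are assumptions, not the repo's data code):

```python
import random
import string

T = 20  # sequence length from the task description

def make_sample():
    k = random.randint(0, 9)                    # rule digit stored at position 0
    b = random.randint(2, T - 3)                # [B] placed uniformly in [2, T-3]
    pool = string.ascii_letters + string.digits
    toks = [str(k)] + [random.choice(pool) for _ in range(T - 1)]
    toks[b] = "[B]"
    out = []
    for i, t in enumerate(toks):
        if i <= b:
            out.append(t)                       # identity region: copy through [B]
        elif t.isdigit():
            out.append(str((int(t) + k) % 10))  # transform region: digits add k mod 10
        else:
            out.append(t.swapcase())            # transform region: letters flip case
    return toks, out
```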
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python train.py --config config/default.yaml

You can override defaults, e.g. fewer steps for a smoke run:
python train.py --max_steps 200 --device cpu

Training logs per-step loss/acc and per-epoch train/val metrics (overall, identity-region, transform-region, and digit vs. letter splits). After training, attention heatmaps and MLP logit-shift heatmaps for one validation sample are saved to /tmp/{experiment_name}_attn.png and /tmp/{experiment_name}_mlp_logits.png.
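As a rough illustration of how the identity/transform metric split could be computed (the function and tensor names below are assumptions for the sketch, not the repo's actual code):

```python
import torch

def region_accuracies(preds, targets, b_pos, pad_id):
    """preds/targets: (batch, T) token ids; b_pos: (batch,) position of [B]."""
    pos = torch.arange(targets.size(1), device=targets.device)  # (T,)
    valid = targets != pad_id                                   # exclude [PAD] tokens
    identity = (pos[None, :] <= b_pos[:, None]) & valid         # copied-through region
    transform = (pos[None, :] > b_pos[:, None]) & valid         # transformed region
    correct = preds == targets
    return {
        "overall": correct[valid].float().mean().item(),
        "identity": correct[identity].float().mean().item(),
        "transform": correct[transform].float().mean().item(),
    }
```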
Swap configs for ablations (a sketch of how such flags might gate the model follows the list):
- No positional encoding:
  python train.py --config config/ablation_no_pos.yaml
- Attention zeroed:
  python train.py --config config/ablation_no_attention.yaml
- No MLP:
  python train.py --config config/ablation_no_mlp.yaml
- No residuals:
  python train.py --config config/ablation_no_residual.yaml
- Frozen embeddings:
  python train.py --config config/ablation_frozen_embeddings.yaml
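The ablation configs presumably toggle model components. As an illustration only, here is a minimal sketch of how such flags might gate a transformer block; the flag names (use_attention, use_mlp, use_residual) are hypothetical, not the repo's config keys:

```python
import torch.nn as nn

class Block(nn.Module):
    def __init__(self, d_model=64, n_heads=4,
                 use_attention=True, use_mlp=True, use_residual=True):
        super().__init__()
        self.use_attention, self.use_mlp, self.use_residual = use_attention, use_mlp, use_residual
        self.ln1, self.ln2 = nn.LayerNorm(d_model), nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model))

    def forward(self, x):
        if self.use_attention:                     # "attention zeroed" -> skip this path
            h = self.ln1(x)
            a, _ = self.attn(h, h, h, need_weights=False)
            x = x + a if self.use_residual else a  # "no residuals" -> replace, don't add
        if self.use_mlp:                           # "no MLP" -> skip this path
            m = self.mlp(self.ln2(x))
            x = x + m if self.use_residual else m
        return x
```

Note that with residuals intact, zeroing the attention output and skipping the attention path coincide (x + 0 = x), so either reading of "attention zeroed" gives the same block.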
Defaults live in config/default.yaml; the CLI flags --config, --max_steps, and --device override them for quick runs. After training, a few synthetic samples are printed for a quick sanity check.