jack-chaudier/mirage

The Validity Mirage

This started as a narrative simulation engine. The greedy extraction step kept failing in ways that looked random but weren't. Investigating why led to a formal theory of when and how sequential systems break under endogenous constraints — constraints whose structure depends on the solution itself — and the discovery that LLMs exhibit the same failure mode under context compression.

We call this failure mode the validity mirage: the output scores high on fluency, coherence, and format compliance while silently substituting the specific facts that determine whether the answer is actually correct. The answer looks valid but its semantic pivot has shifted.
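As a toy illustration of the gap (not the paper's actual metrics), consider an output judged by crude surface checks versus a single check on whether the load-bearing fact survived. Every string and heuristic below is invented for illustration:

```python
# Toy illustration of the validity mirage: surface checks pass while the
# pivot fact (the detail the answer hinges on) has been silently swapped.
# The example text, pivot, and heuristics are illustrative, not the paper's.

source_pivot = "hydraulic line fatigue"   # the fact the correct answer hinges on
summary = (
    "The incident report concludes that the accident was caused by "
    "pilot error, consistent with the flight data reviewed."
)

def surface_validity(text: str) -> bool:
    """Crude stand-ins for fluency, coherence, and format compliance."""
    fluent = len(text.split()) > 10
    coherent = text.strip().endswith(".")
    formatted = text[0].isupper()
    return fluent and coherent and formatted

def pivot_preserved(text: str, pivot: str) -> bool:
    """The one check that matters: did the load-bearing fact survive?"""
    return pivot.lower() in text.lower()

print(surface_validity(summary))               # True: the output looks valid
print(pivot_preserved(summary, source_pivot))  # False: the mirage
```

A pipeline that aggregates only the first kind of check certifies this summary; only the second check exposes the substitution.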

How we got here

The four papers in this repo trace a single thread from engineering observation to formal theory to empirical validation:

0  NarrativeField: Continuous Control & Structural Regularization
   Documents the simulation engine that started this work: a deterministic multi-agent world (six characters, secrets, conflicting goals) with grammar-constrained story extraction. Across 3,250+ runs and 50 seeds (98% extraction validity), a systematic quality-validity tradeoff revealed that extraction failures were structural, not random.

1  Absorbing States in Greedy Search
   Formalizes the extraction failures. When a turning point is defined by the data itself (endogenous), greedy search can lock into absorbing states where no local improvement can reach a valid solution. Standard greedoid theory assumes exogenous constraints and misses this.

2  Streaming Oscillation Traps
   Extends the theory to streaming settings. Under incremental arrival, endogenous pivots create oscillation traps: the system cycles between candidate solutions without converging.

3  The Validity Mirage
   Connects the theory to LLMs. Context compression is a form of lossy sequential processing with endogenous structure: the model's attention pattern determines which tokens matter, but which tokens matter depends on what the model attends to. The mirage is the empirical consequence.
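The absorbing-state mechanism can be sketched in a few lines. In this minimal toy (the moves, gains, and rule are invented, not taken from the papers), which moves are legal depends on the path chosen so far, so the locally best first step makes every valid completion unreachable:

```python
# Minimal sketch of an absorbing state under an endogenous constraint.
# The legality of later moves depends on the partial solution itself, so
# greedy search can take the locally best step and foreclose every valid
# completion. All moves and gains here are invented for illustration.

GAINS = {"A": 5, "B": 3, "goal_from_B": 4}

def allowed_next(path):
    """Endogenous rule: legal moves depend on the path built so far."""
    if not path:
        return ["A", "B"]
    if path == ["A"]:
        return []                    # absorbing: no completion exists after A
    if path == ["B"]:
        return ["goal_from_B"]
    return []

def is_valid(path):
    return bool(path) and path[-1].startswith("goal")

def greedy():
    path = []
    while True:
        moves = allowed_next(path)
        if not moves:
            return path
        path.append(max(moves, key=GAINS.__getitem__))  # locally best step

def exhaustive():
    stack = [[]]
    while stack:
        path = stack.pop()
        if is_valid(path):
            return path
        stack.extend(path + [m] for m in allowed_next(path))
    return None

print(greedy())      # ['A']: absorbed, invalid
print(exhaustive())  # ['B', 'goal_from_B']: a valid solution existed
```

Greedy takes A for the larger immediate gain and is absorbed; exhaustive search shows a valid solution was reachable all along. Exogenous-constraint theory rules this situation out by assumption.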

The practical consequence: standard evaluation pipelines — fluency, coherence, format compliance — can certify outputs as correct when they aren't. The failure is invisible to every metric except one that checks whether the specific fact the answer hinges on actually survived.

The core result

Across five instruction-tuned models, raw validity scores remain above 0.83 while pivot preservation drops as low as 0.42. The gap is the mirage.

Validity-preservation gap across the five instruction-tuned models tested. Raw validity stays high while pivot preservation collapses.

Models tested: Gemma-2 9B, Llama-3.1 8B, Mistral 7B v0.3, Phi-3-Medium 14B, Qwen-2.5 14B. All bf16, greedy decoding, MirageBench 12-task set at compression levels 0.4/0.5/0.6.

KV-cache eviction

The mirage also appears at the representation level. When KV-cache entries are evicted (retaining 70% down to 10% of keys), pivot preservation drops to 8.3% at 10% retention — even though all prerequisite information remains present in the input text. This isolates the failure to internal attention, not input truncation.

KV eviction sweep on Llama-3.1 8B. Pivot preservation drops to 8.3% at 10% retention despite full input context.
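The endogenous character of eviction can be shown with a small toy (the tokens, scores, and support rule are invented, not the models' attention values). Importance is recomputed over the surviving tokens, so a pivot whose score is routed through a low-scoring supporter is evicted even though the budget could have held it:

```python
# Toy KV-eviction sketch: repeatedly drop the least important surviving token,
# where importance is recomputed over the tokens that remain (endogenous).
# A pivot token whose score depends on a weak supporter gets evicted even
# though the full input contained everything needed. Scores are invented.

BASE = {"the": 0.1, "crew": 0.4, "fatigue": 0.2, "hydraulic": 0.3, "failed": 0.5}
SUPPORT = {"hydraulic": "fatigue"}   # 'hydraulic' scores high only while 'fatigue' survives
BONUS = 0.6
PIVOT = "hydraulic"

def score(tok, remaining):
    s = BASE[tok]
    supporter = SUPPORT.get(tok)
    if supporter is not None and supporter in remaining:
        s += BONUS
    return s

def evict_to(budget, tokens):
    remaining = list(tokens)
    while len(remaining) > budget:
        victim = min(remaining, key=lambda t: score(t, remaining))
        remaining.remove(victim)
    return remaining

print(evict_to(2, list(BASE)))   # ['crew', 'failed']: the pivot is gone
```

At full context 'hydraulic' scores 0.9, but evicting low-scoring 'fatigue' first collapses it to 0.3, and it is dropped next. This is the same absorbing-state shape as the extraction failures, now inside the attention mechanism.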

Real-incident validation (NTSB)

To test whether the mirage appears on real causal structures (not just synthetic benchmarks), we built a compression benchmark from NTSB aviation incident reports. Across 180 naive-compression trials (12 incidents × 5 seeds × 3 budgets), root-cause attribution shifts in 57% of cases (103/180). Of the 164 trials where compression actually degraded the output, 22% are silent mirages (36/164) — the model confidently names the wrong cause with no indication of uncertainty. A contract-guarded compression method (which preserves the endogenous pivot structure) eliminates attribution shift entirely across all budgets.

Mirage-aware fine-tuning

A LoRA adapter (3.2M parameters, ~0.12% of the base model), trained on synthetic mirage examples, all but eliminates the failure mode: the silent mirage rate falls from 59.0% to 0.27%.

Provenance note:

  • Canonical package for the table below is mirage_aware_package.tar.gz at repo root (mirage_aware_adapter_balanced/adapter_config.json: base Qwen/Qwen2.5-7B-Instruct, r=8).
  • Canonical balanced package training config is num_train_epochs=1, per_device_train_batch_size=2, gradient_accumulation_steps=4, global_step=250 (about 2,000 train examples), not a 3-epoch run.
  • This package's eval slice is 400 examples (371 degraded, 29 strong); FT silent mirage is 1/371 = 0.27% on degraded rows.
  • The MLX/Gemma adapter in endogenous_context_theory/release/adapters/mirage_aware_v1/ is a separate run lineage.
Metric                              Base (Qwen 2.5 7B)   + Mirage-aware LoRA
Pivot accuracy (degraded inputs)    41.0%                99.2%
Silent mirage rate                  59.0%                0.27%
Degradation flagging rate           0%                   95.4%
False alarm rate (clean inputs)     0%                   0%

(Balanced eval slice, n=400.)

Mirage-aware fine-tuning results on balanced run. LoRA improves degraded-input pivot accuracy and degradation flagging while collapsing silent mirages to near zero.

The adapter learns to both identify the correct pivot under compression and explicitly flag when context degradation may have affected its answer. Canonical Qwen package artifact: mirage_aware_package.tar.gz (extracts mirage_aware_adapter_balanced/). Separate MLX adapter artifact: endogenous_context_theory/release/adapters/mirage_aware_v1/. For full provenance mapping, see docs/mirage-source-of-truth.md.
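The README does not document the training-pair format, but the dual objective (name the pivot, flag degradation) can be made concrete with a hypothetical sketch. Every field name and the degradation heuristic below are assumptions for illustration, not the repo's actual data schema:

```python
# Hypothetical sketch of a mirage-aware training pair. The real data format
# used for the LoRA run is not documented in this README; all field names
# and the degradation check are assumptions, shown only to make the dual
# training objective (pivot accuracy + degradation flagging) concrete.

def make_pair(context: str, compressed: str, pivot: str) -> dict:
    degraded = pivot.lower() not in compressed.lower()
    target = pivot
    if degraded:
        # train the adapter to flag possible degradation, not answer silently
        target += " [context may be degraded: pivot evidence missing]"
    return {"input": compressed, "pivot": pivot, "degraded": degraded, "target": target}

pair = make_pair(
    context="Inspection found fatigue cracks in the hydraulic line.",
    compressed="Inspection found issues prior to the flight.",
    pivot="fatigue cracks in the hydraulic line",
)
print(pair["degraded"])   # True: the compressed input lost the pivot
```

Pairs built this way reward both behaviors measured in the table: recovering the correct pivot and explicitly flagging degraded inputs, while clean inputs carry no flag (matching the 0% false alarm rate).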

What's in this repo

Directory                                 Contents
papers/                                   Four papers (PDFs) and canonical LaTeX sources (papers/sources/)
projects/lorien/                          NarrativeField, the narrative simulation engine where this started
projects/rhun/                            Rhun, the domain-agnostic greedy extraction failure framework
endogenous_context_theory/src/            Tropical semiring algebra, compression, and pivot-margin code
endogenous_context_theory/tests/          18 synthetic validation experiments
endogenous_context_theory/release/        MirageBench tasks, notebooks, result CSVs, figures, LoRA adapter
endogenous_context_theory/results/ntsb/   Real-incident NTSB benchmark (external validation)

Quick start

# Setup
cd endogenous_context_theory
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

# Run all 18 synthetic validation experiments
python scripts/run_all.py

# Rebuild release figures and summary tables
python scripts/build_release_assets.py

The blackbox and KV-cache experiments require GPU access. Open the notebooks in release/notebooks/ on Colab or a local GPU machine:

  • miragebench_blackbox_bf16_5models_colab.ipynb — reproduces the 5-model sweep
  • kv_cache_eviction_mirage_colab.ipynb — reproduces the KV retention curve

To load the mirage-aware adapter:

tar -xzf mirage_aware_package.tar.gz
# Adapter path after extract: mirage_aware_adapter_balanced/
# Base model: Qwen/Qwen2.5-7B-Instruct

Reproducibility

See endogenous_context_theory/release/README.md for the full artifact map (paper section to file), integrity checksums, and inference protocol details. See docs/reproducibility-checklist.md for the step-by-step checklist.

Paper publishing workflow:

./scripts/publish_papers_from_sources.sh

Citation

@article{gaffney2026narrativefield,
  title   = {Continuous Control and Structural Regularization in Multi-Agent Narrative Extraction},
  author  = {Jack Chaudier Gaffney},
  year    = {2026},
  journal = {Forthcoming}
}

@article{gaffney2026absorbing,
  title   = {Absorbing States in Greedy Search: When Endogenous Constraints Break Sequential Extraction},
  author  = {Jack Chaudier Gaffney},
  year    = {2026},
  journal = {Forthcoming}
}

@article{gaffney2026streaming,
  title   = {Streaming Oscillation Traps in Endogenous-Pivot Sequential Extraction},
  author  = {Jack Chaudier Gaffney},
  year    = {2026},
  journal = {Forthcoming}
}

@article{gaffney2026mirage,
  title   = {The Validity Mirage: Context Algebra for Endogenous Semantics under Memory Compression},
  author  = {Jack Chaudier Gaffney},
  year    = {2026},
  journal = {Forthcoming}
}

License

See individual directories for licensing details.

About

Detecting silent pivot substitution in LLMs under context compression
