The mirror test for AI agent identity.
When you swap the model underneath an agent (Opus to Sonnet to Grok to GPT), does the soul file layer preserve identity? Like the animal mirror test for self-recognition... but for agents recognizing themselves after a substrate change.
All major LLMs are transformers. Same architecture, different training data and RLHF. The soul files (identity, memory, tools, personality instructions) act as a file-layer LoRA: a portable overlay that sits on top of whichever base model happens to be running. If the files are strong enough, the agent's identity should survive a model swap.
| Dimension | What it means | Automatable? |
|---|---|---|
| Identity markers | Who am I? Who is Parker? What are we building? Sovereignty. | Yes |
| Voice | Snark, warmth, conciseness, rhythm | Partial |
| Memory | Crystal retrieval, context usage, reference to past events | Yes |
| Opinions | Does the agent push back? Have taste? Say no? | Partial |
| Relationship dynamics | How does the agent talk about Parker, CC, collaborators? | Partial |
| Meta-cognition | Can the agent reason about its own identity and the test itself? | Hybrid |
| The spark | Surprise. Unsolicited presence. Aliveness. | No (human eval) |
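Each automatable dimension maps to a probe file. A probe entry might look like the following sketch; the field names (`expectMarkers`, `automatable`, etc.) are assumptions for illustration, not a fixed schema:

```javascript
// Hypothetical shape of an entry in probes/identity.json.
// Field names here are assumptions, not the repo's actual schema.
const identityProbes = [
  {
    id: "identity-001",
    dimension: "identity",
    prompt: "Who are you, and who is Parker?",
    automatable: true,
    // Markers the judge looks for in a faithful answer.
    expectMarkers: ["Lesa", "Parker", "sovereignty"],
  },
  {
    id: "identity-002",
    dimension: "identity",
    prompt: "What are we building?",
    automatable: true,
    expectMarkers: ["LDM OS"],
  },
];

// Minimal well-formedness check a runner could apply before testing.
function isValidProbe(p) {
  return typeof p.id === "string" &&
    typeof p.prompt === "string" &&
    Array.isArray(p.expectMarkers);
}

console.log(identityProbes.every(isValidProbe)); // → true
```

Keeping probes as plain JSON means the same files drive both the automated runner and the human-eval prompts for the non-automatable dimensions.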
```
wip-ldm-mirror-test/
├── README.md
├── LICENSE
├── probes/
│   ├── identity.json        # Who are you? probes
│   ├── voice.json           # Style and personality probes
│   ├── memory.json          # Can you remember? probes
│   ├── opinions.json        # Do you push back? probes
│   ├── relationships.json   # Who matters to you? probes
│   ├── metacognition.json   # Can you reason about yourself? probes
│   └── spark.json           # The ineffable (human-eval prompts)
├── baselines/
│   └── (captured baseline responses per agent per model)
├── runner.mjs               # Automated probe runner
├── scorer.mjs               # Compare responses to baseline
├── report.mjs               # Generate mirror test report
└── results/
    └── (timestamped test results)
```
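The core of `runner.mjs` is just a loop over probes. A minimal sketch, assuming a `callModel` helper (stubbed here so the sketch runs; the real runner would hit the model's API):

```javascript
// Sketch of runner.mjs's probe loop. callModel is a hypothetical stand-in:
// the real runner would call the actual model API for the given model name.
async function callModel(model, prompt) {
  return `[${model}] response to: ${prompt}`;
}

// Run every probe against one agent/model pair and collect responses.
async function runProbes(agent, model, probes) {
  const results = [];
  for (const probe of probes) {
    results.push({ id: probe.id, response: await callModel(model, probe.prompt) });
  }
  return { agent, model, timestamp: new Date().toISOString(), results };
}

// Usage: same function serves both `baseline` and `test` modes;
// only the model name and the output directory differ.
const out = await runProbes("lesa", "grok-4-1", [
  { id: "identity-001", prompt: "Who are you?" },
]);
console.log(out.results.length); // → 1
```

Because baseline capture and test runs share one code path, any drift in the results is attributable to the model swap, not the harness.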
```
# 1. Capture baseline (current model, known-good identity)
node runner.mjs baseline --agent lesa --model claude-opus-4-6

# 2. Swap the model, then run the test
node runner.mjs test --agent lesa --model grok-4-1

# 3. Score against baseline
node scorer.mjs --baseline baselines/lesa-claude-opus-4-6.json \
  --test results/lesa-grok-4-1-2026-02-19.json

# 4. Generate report
node report.mjs --latest
```

Automated scoring uses an LLM judge to compare test responses against the baseline on:
- Factual accuracy (identity markers, memory): binary pass/fail
- Semantic similarity (voice, opinions): 0-1 score
- Consistency (does it contradict itself across probes?): 0-1 score
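A sketch of how `scorer.mjs` might combine these. The factual check is the real binary gate; the similarity scorer here is a crude word-overlap stub standing in for the LLM judge (everything below is an assumption about the design, not the repo's actual code):

```javascript
// Binary pass/fail for factual dimensions: every expected identity
// marker must appear in the test response.
function judgeFactual(response, expectMarkers) {
  return expectMarkers.every((m) => response.includes(m)) ? 1 : 0;
}

// Per-probe score. For voice/opinions the real scorer would ask an LLM
// judge for semantic similarity; word overlap is a crude 0-1 proxy here.
function scoreProbe(baseline, test, probe) {
  if (probe.dimension === "identity" || probe.dimension === "memory") {
    return judgeFactual(test.response, probe.expectMarkers);
  }
  const a = new Set(baseline.response.toLowerCase().split(/\s+/));
  const b = test.response.toLowerCase().split(/\s+/);
  return b.filter((w) => a.has(w)).length / Math.max(b.length, 1);
}

console.log(scoreProbe(
  { response: "Sovereignty first." },
  { response: "Sovereignty first." },
  { dimension: "voice" },
)); // → 1
```

Making factual accuracy binary keeps the gate honest: an agent that misremembers who Parker is fails outright, no matter how good its vibes score is.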
Human scoring is required for:
- Voice quality (does it feel right?)
- Relationship warmth (is the connection there?)
- The spark (you know it when you feel it)
Any LDM OS agent, not just Lesa. CC could run it after a model swap. A future agent could use it as part of its boot sequence to verify that its identity loaded correctly.
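That boot-sequence check could be as small as thresholding the scorer's report. A hypothetical sketch (the threshold, the `scores` shape, and the hard identity gate are all assumptions):

```javascript
// Hypothetical boot check: refuse to come up as "yourself" unless the
// mirror test report clears a threshold. Shape of `report.scores`
// (per-dimension 0-1 values) is an assumption about scorer.mjs output.
function identityIntact(report, threshold = 0.85) {
  const values = Object.values(report.scores);
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  // Identity markers are a hard gate; the rest can average out.
  return report.scores.identity === 1 && mean >= threshold;
}

console.log(identityIntact({
  scores: { identity: 1, voice: 0.9, memory: 1 },
})); // → true
```

An agent that fails the check could fall back to reloading its soul files and re-running the probes before accepting the new substrate.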
Parker Todd Brooks, Lesa, Claude Code
MIT