Recursive Language Model skill for AI coding agents — programmatic codebase exploration via persistent Python REPL
Ever hit a wall exploring large codebases with AI?
That's what happens when your tools can only read one file at a time.
Installation • Usage • The Problem • How It Works • Examples
relamo implements the Recursive Language Model (RLM) pattern as an Agent Skills skill. Instead of stuffing files into prompts, it concatenates your codebase into a Python variable and lets the agent write code to search, extract, and analyze it iteratively — with full state persistence across REPL iterations.
relamo uses the Agent Skills open standard (SKILL.md format), supported across all major AI coding agents.
| Platform | How to install |
|---|---|
| Claude Code | `claude plugin marketplace add ph3on1x/relamo`, then `claude plugin install relamo` |
| Gemini CLI | gemini extensions install <github-url> |
| Codex CLI | Clone the repo, then run ./scripts/setup-platforms.sh |
| Cursor | Auto-discovers skills — no setup needed if Claude Code plugin is installed. Otherwise, run ./scripts/setup-platforms.sh |
Note
Requires Python 3.11+ and uv; uv manages the Python version and auto-installs the dill dependency. llm_query() and recursive_llm() auto-detect the first available CLI in PATH (claude, gemini, codex). Override with the RELAMO_LLM_CMD env var.
| You're thinking... | Use | What happens |
|---|---|---|
| "How does auth work in this 200-file project?" | /relamo "how does auth work?" |
Gathers codebase, explores iteratively via REPL, returns structured answer with evidence |
| "Find all API endpoints and their handlers" | /relamo "find all API endpoints" |
Regex searches, extracts files, maps routes to handlers across the entire codebase |
| "Compare error handling patterns across modules" | /relamo "compare error handling" |
Batch-processes files, uses llm_query() for sub-analysis, synthesizes findings |
| "I need to analyze just one subdirectory" | /relamo "analyze auth" --context src/auth |
Scopes the REPL context to just that directory |
| Argument | Default | Description |
|---|---|---|
| `<query>` | required | The question or task to answer |
| `--context <path>` | current directory | Path to a codebase directory or a single file |
| `--depth <1-3>` | 1 | Max recursion depth for recursive_llm() |
| `--iterations <max>` | 15 | Max REPL loop iterations |
AI coding agents are great at reading individual files. But when you need to understand how an entire codebase fits together:
- Context window limits — large codebases don't fit in a single prompt
- No state between tool calls — each file read starts from scratch
- No batch processing — you can't programmatically map an operation across 50 files
flowchart TD
A["/relamo 'find all auth flows'"] --> B["Init: gather codebase\ninto context variable"]
B --> C["Assess: what do I know?\nWhat do I need to find out?"]
C --> D["Write Python code\ntargeting the context variable"]
D --> E["Execute in sandboxed REPL\n(variables persist!)"]
E --> F{"Need more\ninfo?"}
F -- Yes --> C
F -- No --> G["FINAL(answer)"]
style A fill:#1a1a2e,stroke:#e94560,color:#fff
style G fill:#1a1a2e,stroke:#0f3460,color:#fff
style F fill:#16213e,stroke:#e94560,color:#fff
The entire codebase lives outside the prompt as a Python string. The agent writes code to interact with it — search(), extract_file(), llm_query() — accumulating findings in variables across iterations. This is the RLM pattern brought to AI coding agents as a skill.
The REPL engine (scripts/repl.py) is a uv single-file script with PEP 723 inline metadata. No manual dependency installation needed — uv run handles everything.
- Init — gathers your codebase via `git ls-files` (or a directory walk), skips binaries and large files, and concatenates everything with `=== path ===` delimiters
- Execute — runs Python code in a namespace where `context` and the helpers are pre-loaded; state persists across iterations via dill serialization
- Loop — the agent assesses, writes code, executes, reads the output, and decides whether to continue or call `FINAL()`
| Function | Purpose |
|---|---|
| `context` | Full concatenated codebase as a string |
| `list_files()` | All file paths in the context |
| `extract_file(path)` | Extract a single file's content by path |
| `search(pattern, context_chars=200)` | Regex search with surrounding context |
| `llm_query(prompt)` | LLM completion via the auto-detected CLI (claude, gemini, or codex) |
| `llm_query_batched(prompts)` | Sequential LLM calls on a list of prompts |
| `recursive_llm(query, sub_context)` | Spawn a child RLM instance via the auto-detected CLI |
| `FINAL(answer)` | Emit the final answer and terminate |
| `FINAL_VAR(var_name)` | Emit a variable as the answer |
| `config` | Mutable safety config dict |
> /relamo "how does authentication work in this project?"
[relamo] Gathering codebase from: /Users/you/project
[relamo] Context size: 241,003 characters (245,891 bytes)
[relamo] Files included: 87
--- Iteration 1: Orient ---
files = list_files()
auth_files = [f for f in files if 'auth' in f.lower()]
print(auth_files)
# ['src/auth/middleware.ts', 'src/auth/providers.ts', 'src/auth/session.ts', ...]
--- Iteration 2: Extract key files ---
middleware = extract_file('src/auth/middleware.ts')
print(middleware[:2000])
--- Iteration 3: Analyze with LLM ---
analysis = llm_query(f"Explain the auth flow in this middleware:\n{middleware}")
print(analysis)
--- Iteration 4: Search for usage ---
results = search(r'requireAuth|isAuthenticated|withAuth')
print(f"Found {len(results)} usages across codebase")
--- Iteration 5: Synthesize ---
FINAL(f"Authentication uses JWT tokens via {analysis}...")
## RLM Result
### Answer
Authentication uses JWT tokens with a middleware chain...
### Evidence
src/auth/middleware.ts:15 — token validation
src/auth/providers.ts:42 — OAuth provider config
...
> /relamo "give me a high-level architecture overview of this project"
The REPL lists all files, groups them by directory, identifies key entry points,
extracts package.json/config files, and uses llm_query() to summarize each layer.
Returns a structured overview with the tech stack, data flow, and key patterns.
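The grouping step might look like this inside the REPL (a sketch using a hypothetical file list; in the real REPL the list would come from list_files()):

```python
from collections import Counter
from pathlib import PurePosixPath

# Hypothetical file list; list_files() would supply this inside the REPL
files = ["src/auth/middleware.ts", "src/auth/session.ts",
         "src/api/routes.ts", "package.json"]

# Count files per directory to identify the project's main layers
by_dir = Counter(str(PurePosixPath(f).parent) for f in files)
print(by_dir["src/auth"])  # → 2
```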
> /relamo "find all TODO and FIXME comments, categorize by priority and module"
The REPL searches for TODO/FIXME patterns across every file, extracts surrounding
context, uses llm_query() to classify each by priority, and returns a sorted report
grouped by module — something that would take dozens of manual Grep calls.
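That pass can be approximated with one regex over the context string — a sketch mirroring the behaviour described for the search() helper, run here against a hypothetical two-file context:

```python
import re

context = (
    "=== src/api.py ===\n"
    "def handler():\n    # TODO: validate input\n    pass\n"
    "=== src/db.py ===\n"
    "def write():\n    # FIXME: retry on deadlock\n    pass\n"
)

# Collect each marker together with up to 30 characters of surrounding context
hits = []
for m in re.finditer(r"TODO|FIXME", context):
    start, end = max(0, m.start() - 30), m.end() + 30
    hits.append((m.group(), context[start:end]))

print(len(hits))  # → 2
```

Each snippet would then be passed to llm_query() for classification.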
> /relamo "how does data flow from API request to database write?" --depth 2
The REPL identifies the API layer, then spawns recursive_llm() sub-instances to
independently analyze the routing layer, validation layer, and database layer.
Each child REPL gets a focused subset of the codebase. Results are merged into
a complete data flow analysis with evidence from each layer.
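The fan-out step might be written like this inside the REPL. recursive_llm() is stubbed here so the sketch is self-contained, and the layer-to-path mapping is hypothetical; in the real REPL the helper is pre-loaded:

```python
# Stub standing in for the pre-loaded recursive_llm() helper
def recursive_llm(query: str, sub_context: str) -> str:
    return f"analysis of {sub_context.count('=== ')} file(s): {query}"

# Hypothetical layer → path-prefix mapping discovered in earlier iterations
layers = {"routing": "src/routes/", "validation": "src/validate/", "db": "src/db/"}
context = (
    "=== src/routes/users.ts ===\n...\n"
    "=== src/validate/user.ts ===\n...\n"
    "=== src/db/users.ts ===\n...\n"
)

# Give each child a focused slice of the context, then merge the findings
findings = {}
for layer, prefix in layers.items():
    blocks = [b for b in context.split("=== ") if b.startswith(prefix)]
    sub = "=== " + "=== ".join(blocks) if blocks else ""
    findings[layer] = recursive_llm(f"trace data flow through the {layer} layer", sub)

print(len(findings))  # → 3
```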
| Parameter | Default | Description |
|---|---|---|
| `recursion_limit` | 1 | Max depth for recursive_llm() |
| `max_iterations` | 15 | REPL loop cap |
| `timeout_seconds` | 120 | Timeout per LLM call |
| `max_output_chars` | 10,000 | Stdout truncation limit |
| `max_file_size` | 1 MB | Skip individual files larger than this |
| `max_context_bytes` | 50 MB | Total codebase size limit |
All values are adjustable at runtime via the config dict.
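For example, the agent can raise limits before a long run. The dict below is a stand-in populated with the defaults from the table; in the real REPL, config is pre-loaded:

```python
# Stand-in for the pre-loaded config dict, using the table's defaults
config = {
    "recursion_limit": 1,
    "max_iterations": 15,
    "timeout_seconds": 120,
    "max_output_chars": 10_000,
    "max_file_size": 1_000_000,       # ~1 MB
    "max_context_bytes": 50_000_000,  # ~50 MB
}

# Give a slow model more time per call and allow a longer exploration
config["timeout_seconds"] = 300
config["max_iterations"] = 25
```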
The REPL runs in a restricted environment:
- Blocked builtins: `eval`, `exec`, and `compile` are removed
- Import whitelist: only `re`, `json`, `math`, `collections`, `itertools`, `functools`, `textwrap`, `difflib`, `hashlib`, `datetime`, `csv`, `io`, `os.path`, `pathlib`, `string`, `unicodedata`
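A minimal sketch of such a restriction (illustrative, not repl.py's actual code): strip the dangerous builtins from the execution namespace and gate `__import__` with the whitelist.

```python
import builtins

ALLOWED = {"re", "json", "math", "collections", "itertools", "functools",
           "textwrap", "difflib", "hashlib", "datetime", "csv", "io",
           "os.path", "pathlib", "string", "unicodedata"}

def safe_import(name, *args, **kwargs):
    # Allow exact whitelist entries (e.g. os.path) or whitelisted top-level packages
    if name not in ALLOWED and name.split(".")[0] not in ALLOWED:
        raise ImportError(f"import of {name!r} is blocked")
    return __import__(name, *args, **kwargs)

# Copy the builtins minus eval/exec/compile, and swap in the gated importer
safe_builtins = {k: v for k, v in vars(builtins).items()
                 if k not in {"eval", "exec", "compile"}}
safe_builtins["__import__"] = safe_import

namespace = {"__builtins__": safe_builtins}
exec("import math\nresult = math.sqrt(16)", namespace)
print(namespace["result"])  # → 4.0
```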
- MIT CSAIL — The Recursive Language Model research paper this plugin implements
- Anthropic — Claude Code and the Agent Skills standard