Local-ready LLM execution harness for complex tasks.
Loom decomposes goals, drives execution through verified steps, routes between thinking and acting models, and keeps work on track with structured state instead of chat history. Use it via TUI, CLI, API, or MCP with local or mixed local/cloud models.
Bring Kimi, Minimax, GLM, Claude, or any OpenAI-compatible model and Loom supplies the harness: tool calling, structured planning, parallel execution, independent verification, and persistent memory.
It handles coding, research, document analysis (PDF, Word docs, PowerPoint decks), report generation, and multi-step business workflows.
Claude-class cowork UX, local-ready. Tools like Claude Code and Claude cowork deliver strong agentic experiences, and Claude Code can be paired with local model stacks depending on your setup. Loom's focus is different: a model-agnostic harness designed to keep local and mixed local/cloud execution reliable with structured planning, tool safety, independent verification, and persistent memory. Loom is also cross-platform, while Claude cowork is currently macOS + Claude-model oriented. The result is an agentic workflow that stays robust on your own hardware without locking you to one provider.
Loom also exposes a REST API and an MCP server built for agentic systems. Orchestrators like OpenClaw can call Loom's REST endpoints -- or connect via the Model Context Protocol -- to offload complex multi-step tasks: decomposition, tool calling, verification, and memory. Instead of hoping a single LLM call handles a 15-step workflow, hand it to Loom and let the harness drive. The MCP integration also means any MCP-compatible agent or IDE can use Loom as a tool provider out of the box.
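As a rough illustration of that offloading pattern, here is a minimal client sketch against a locally running `uv run loom serve`. The endpoint paths, request/response fields, and default port below are assumptions for illustration, not Loom's documented API surface; consult the API reference for the real routes.

```python
# Hand a multi-step goal to Loom over REST and poll for completion.
# Assumes `uv run loom serve` is running locally; /tasks, task_id, and status
# are hypothetical names used only to sketch the flow.
import time
import requests

BASE = "http://localhost:8000"  # assumed default serve address

resp = requests.post(f"{BASE}/tasks", json={
    "goal": "Research competitive landscape for X and produce a briefing",
    "workspace": "/tmp/research",
})
resp.raise_for_status()
task_id = resp.json()["task_id"]

# Poll until the harness reports a terminal state (SSE streaming is also available).
while True:
    status = requests.get(f"{BASE}/tasks/{task_id}").json()["status"]
    if status in ("completed", "failed", "cancelled"):
        break
    time.sleep(5)

print(f"Task {task_id} finished with status: {status}")
```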
LLMs are phenomenal at answering a single, well-formed question. Give them a bounded task, and they can produce remarkably strong results.
But real work is rarely a single question. It is a chain of decisions, dependencies, checks, and revisions. In that setting, even strong models can drift: they skip steps, lose context, or confidently invent details. The problem is not intelligence. The problem is process.
Loom exists to provide that process.
Loom breaks complex work into smaller, solvable units, runs them in sequence, and verifies each result before moving forward. That lets you use model intelligence where it is strongest while reducing hallucinations, compounding errors, and dead-end reasoning.
Loom is local-ready and supports local models natively, so you get privacy, control, and lower operating cost. It is not limited to local inference: you can use cloud models too, and apply Loom to far more than coding, including research, analysis, and operational workflows.
Interactive (uv run loom) -- Work with a model in a rich terminal UI. You talk, the model responds and uses tools, you see what it's doing in real time. Streaming text, inline diffs, per-tool-call approval, session persistence, conversation recall, and slash commands for control.
Autonomous (uv run loom run) -- Give Loom a goal, walk away. It decomposes the work into subtasks with a dependency graph, runs independent subtasks in parallel, verifies each result with an independent model, and replans when things go wrong.
+----------------------------+
Goal -> Planner -> | [Subtask A] [Subtask B] | parallel batch
| | | | (if independent)
| Execute Execute |
| Verify Verify |
| Extract* Extract* | * fire-and-forget
+----------------------------+
|
[Subtask C] (depends on A+B)
|
Completed
Built for local model weaknesses. Cloud models reproduce strings precisely. Local models don't -- they drift on whitespace, swap tabs for spaces, drop trailing newlines. Loom's edit tool handles this with fuzzy matching: when an exact string match fails, it normalizes whitespace and finds the closest candidate above a similarity threshold. It also rejects ambiguous matches (two similar regions) so it won't silently edit the wrong place. This is the difference between a tool that works with MiniMax and one that fails 30% of the time.
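A simplified sketch of that idea, not Loom's actual implementation: normalize whitespace, score candidate regions against the target string, and refuse to act unless there is exactly one clear winner.

```python
# Whitespace-tolerant fuzzy matching for file edits (illustrative sketch).
import difflib
import re

def _normalize(s: str) -> str:
    # Collapse whitespace runs so tab/space and trailing-newline drift don't matter.
    return re.sub(r"\s+", " ", s).strip()

def find_fuzzy_match(haystack: list[str], needle: str, threshold: float = 0.9) -> int | None:
    """Return the start line of the best match, or None if absent or ambiguous."""
    needle_norm = _normalize(needle)
    window = needle.count("\n") + 1
    scores = []
    for i in range(len(haystack) - window + 1):
        candidate = _normalize("\n".join(haystack[i:i + window]))
        scores.append((difflib.SequenceMatcher(None, needle_norm, candidate).ratio(), i))
    scores.sort(reverse=True)
    if not scores or scores[0][0] < threshold:
        return None                      # nothing close enough
    if len(scores) > 1 and scores[1][0] >= threshold:
        return None                      # two plausible regions: refuse to guess
        # (a real implementation would also ignore windows overlapping the best match)
    return scores[0][1]
```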
Lossless memory, not lossy summarization. Most agents compress old conversation turns into summaries when context fills up. This destroys information. Loom takes a different approach: every cowork turn is persisted verbatim to SQLite. When context fills up, old turns drop out of the model's window but remain fully searchable. The model has a conversation_recall tool to retrieve anything it needs -- specific turns, tool call history, full-text search. Resume any previous session exactly where you left off with --resume. This archival guarantee is for cowork history; /run and uv run loom run may semantically compact model-facing payloads to stay within context budgets, while preserving source artifacts/logs.
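The shape of that "persist verbatim, search later" approach can be sketched with plain SQLite full-text search. The table names and schema below are invented for illustration and are not Loom's actual schema.

```python
# Store every turn verbatim; recall anything later via full-text search.
import sqlite3

db = sqlite3.connect("cowork_history.db")
db.executescript("""
CREATE TABLE IF NOT EXISTS turns (
    id INTEGER PRIMARY KEY,
    session_id TEXT,
    role TEXT,       -- 'user' | 'assistant' | 'tool'
    content TEXT     -- stored verbatim, never summarized
);
CREATE VIRTUAL TABLE IF NOT EXISTS turns_fts
    USING fts5(content, content='turns', content_rowid='id');
""")

def record_turn(session_id: str, role: str, content: str) -> None:
    cur = db.execute(
        "INSERT INTO turns (session_id, role, content) VALUES (?, ?, ?)",
        (session_id, role, content),
    )
    db.execute("INSERT INTO turns_fts (rowid, content) VALUES (?, ?)", (cur.lastrowid, content))
    db.commit()

def recall(query: str, limit: int = 5) -> list[tuple[str, str]]:
    # Search over every turn ever stored, even ones outside the model's context window.
    return db.execute(
        "SELECT t.role, t.content FROM turns_fts JOIN turns t ON t.id = turns_fts.rowid "
        "WHERE turns_fts MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    ).fetchall()
```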
The harness drives, not the model. The model is a reasoning engine called repeatedly with scoped prompts. The orchestrator decides what happens next: which subtasks to run, when to verify, when to replan, when to escalate. This means a weaker model in a strong harness outperforms a stronger model in a weak one.
Verification as a separate concern. The model never checks its own work. An independent verifier (which can be a different, cheaper model) validates results at three tiers: deterministic checks (does the output exist? does it meet structural requirements?), independent LLM review, and multi-vote consensus for high-stakes changes.
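A rough sketch of that tiered split, with illustrative shapes rather than Loom's interfaces: cheap deterministic checks run first, then a single independent review, and consensus voting only when the change is high-stakes.

```python
# Three-tier verification sketch: deterministic checks, independent review, consensus.
from dataclasses import dataclass
from pathlib import Path
from typing import Callable

@dataclass
class Verdict:
    passed: bool
    reason: str = ""

def tier1_deterministic(output_path: Path) -> Verdict:
    if not output_path.exists():
        return Verdict(False, "expected output file is missing")
    if output_path.stat().st_size == 0:
        return Verdict(False, "output file is empty")
    return Verdict(True)

def verify(output_path: Path, review: Callable[[str], Verdict],
           high_stakes: bool = False, votes: int = 3) -> Verdict:
    # Tier 1: structural checks -- no model call needed.
    verdict = tier1_deterministic(output_path)
    if not verdict.passed:
        return verdict
    content = output_path.read_text()
    # Tier 2: one independent reviewer (never the model that produced the work).
    verdict = review(content)
    if not verdict.passed or not high_stakes:
        return verdict
    # Tier 3: consensus -- in practice each vote is an independent model call.
    passes = sum(review(content).passed for _ in range(votes))
    return Verdict(passes * 2 > votes, f"{passes}/{votes} reviewers approved")
```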
Full undo. Every file write is preceded by a snapshot. You can revert any individual change, all changes from a subtask, or the entire task. The changelog tracks creates, modifies, deletes, and renames with before-state snapshots.
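The snapshot-before-write mechanic can be sketched as follows. This is illustrative only; Loom's changelog also records deletes, moves, and renames and groups changes by subtask and task.

```python
# Snapshot-before-write undo sketch.
from pathlib import Path

class Changelog:
    def __init__(self) -> None:
        self._snapshots: list[tuple[Path, bytes | None]] = []

    def write_file(self, path: Path, content: str) -> None:
        # Capture the before-state (None means the file did not exist yet).
        before = path.read_bytes() if path.exists() else None
        self._snapshots.append((path, before))
        path.write_text(content)

    def revert_last(self) -> None:
        path, before = self._snapshots.pop()
        if before is None:
            path.unlink(missing_ok=True)   # file was created by us -> remove it
        else:
            path.write_bytes(before)       # restore the pre-edit bytes

    def revert_all(self) -> None:
        while self._snapshots:
            self.revert_last()
```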
Dozens of built-in tools. Loom includes file operations (read/write/edit/delete/move with fuzzy matching), shell + git safety, ripgrep + glob search, web fetch/search, code analysis (tree-sitter when installed; regex fallback), calculator + spreadsheet operations, document generation, task tracking, and conversation recall. Optional software-development integrations add external coding-agent tools (openai_codex, claude_code, opencode) and WordPress workflow tools (wp_cli, wp_env, block scaffolding, quality gates) behind execution feature flags.
It also ships research helpers (academic search, archives, citations, fact checking, OCR, timeline/inflation analysis, correspondence/social mapping) plus a keyless investment suite for market data, SEC fundamentals, macro regime scoring, factor exposure, valuation, ranking, and portfolio analysis/recommendation. Tools are auto-discovered via __init_subclass__.
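The `__init_subclass__` discovery mechanism mentioned above looks roughly like this. Class and attribute names are illustrative, not Loom's API; the point is that defining a subclass is the registration step.

```python
# Tool auto-discovery via __init_subclass__: subclasses register themselves.
TOOL_REGISTRY: dict[str, type] = {}

class Tool:
    name: str = ""

    def __init_subclass__(cls, **kwargs) -> None:
        super().__init_subclass__(**kwargs)
        if cls.name:                      # every concrete subclass self-registers
            TOOL_REGISTRY[cls.name] = cls

    def run(self, **params): ...

class Calculator(Tool):
    name = "calculator"

    def run(self, expression: str) -> float:
        # Deliberately tiny: evaluate simple arithmetic only, no builtins exposed.
        return float(eval(expression, {"__builtins__": {}}, {}))

print(sorted(TOOL_REGISTRY))   # ['calculator'] -- no manual registration step needed
```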
Inline diffs. Every file edit produces a unified diff in the tool result. Diffs render with Rich markup syntax highlighting in the TUI -- green additions, red removals. You always see exactly what changed.
Process definitions. YAML-based domain specialization lets you define personas, phase blueprints, verification/remediation policy, evidence contracts, and prompt constraints for any workflow (schema_version: 2). A process can represent a consulting methodology, an investment analysis framework, a research protocol, or a coding standard -- the engine doesn't care. Loom ships with 6 built-in processes and supports installing more from GitHub.
If you're new, start with:
- `uv sync` to install dependencies.
- `uv run loom -w /path/to/workspace` to launch the TUI and run setup.
- `/run <goal>` inside the TUI for your first harnessed task.
- `uv run loom run "<goal>" -w /path/to/workspace` for autonomous execution.
# Install
uv sync # or: pip install -e .
# Launch — the setup wizard runs automatically on first start
uv run loom -w /path/to/workspace
# With a process definition (explicit run command)
uv run loom -w /path/to/workspace
# /consulting-engagement Analyze client onboarding flow
# Force process orchestration from inside the TUI (no uv run loom serve required)
# /processes # process catalog
# /run Analyze Tesla for investment
# /run problem.md # load goal from workspace file
# /run @problem.md prioritize parser issues
# /run close # close current run tab (with confirmation)
# /investment-analysis Analyze Tesla for investment
# Resume a previous session
uv run loom --resume <session-id>
# Autonomous task execution
uv run loom run "Refactor the auth module to use JWT" --workspace /path/to/project
uv run loom run "Research competitive landscape for X and produce a briefing" -w /tmp/research
uv run loom run "Analyze Q3 financials and flag anomalies" -w /tmp/analysis
# Start the API server (for programmatic access)
uv run loom serve

On first launch, Loom's built-in setup wizard walks you through provider selection, model configuration, and role assignment, all inside the TUI. The wizard writes ~/.loom/loom.toml for you. Run /setup from inside the TUI at any time to reconfigure, or uv run loom setup from the CLI.
You can also create the config manually. Loom reads loom.toml from the current directory or ~/.loom/loom.toml:
[models.primary]
provider = "ollama" # or "openai_compatible" or "anthropic"
base_url = "http://localhost:11434"
model = "kimi-k2.5"
max_tokens = 8192
temperature = 0.1
roles = ["planner", "verifier"]
[models.utility]
provider = "ollama"
base_url = "http://localhost:11434"
model = "minimax-m2.1"
max_tokens = 2048
temperature = 0.0
roles = ["extractor", "executor", "compactor"]
[execution]
max_subtask_retries = 3
max_loop_iterations = 50
max_parallel_subtasks = 3
delegate_task_timeout_seconds = 3600
enable_process_iteration_loops = false
enable_iteration_command_exit_gate = false
[telemetry]
mode = "active" # off | active | all_typed | debug
runtime_override_enabled = true
runtime_override_api_enabled = false
runtime_override_api_token = ""
persist_runtime_override = false
debug_diagnostics_rate_per_minute = 120
debug_diagnostics_burst = 30
[limits.runner]
enable_filetype_ingest_router = true
enable_artifact_telemetry_events = true
artifact_telemetry_max_metadata_chars = 1200
enable_model_overflow_fallback = true
ingest_artifact_retention_max_age_days = 14
ingest_artifact_retention_max_files_per_scope = 96
ingest_artifact_retention_max_bytes_per_scope = 268435456

For TUI startup behavior:
[tui]
startup_landing_enabled = true # show landing when startup is not resuming
always_open_chat_directly = false # bypass landing and enter chat immediately

`always_open_chat_directly` takes precedence when set to true.
Three model backends: Ollama, OpenAI-compatible APIs (LM Studio, vLLM, text-generation-webui), and Anthropic/Claude. Models are assigned roles (planner, executor, extractor, verifier, compactor). A common split assigns a stronger model to planning + verification and a cheaper model to extraction + execution + compaction.
Manage external MCP servers in ~/.loom/mcp.toml (or workspace ./.loom/mcp.toml):
[mcp.servers.notion]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-notion"]
timeout_seconds = 30
enabled = true
[mcp.servers.notion.env]
NOTION_TOKEN = "${NOTION_TOKEN}"

MCP merge precedence is: --mcp-config > ./.loom/mcp.toml > ~/.loom/mcp.toml > legacy [mcp] in loom.toml.
Configured MCP servers are auto-discovered at startup and registered as namespaced tools (mcp.<server>.<tool>).
When a run has an auth context, MCP discovery is scoped to that run's selected
profiles; auth-scoped MCP tools are not leaked into the global registry view
used by other concurrent runs.
For OAuth-enabled remote aliases, use browser-first login:
uv run loom mcp auth login <alias>

Use --manual-token --access-token ... only as a headless fallback.
MCP OAuth alias tokens are stored separately in ~/.loom/mcp_oauth_tokens.json.
This store is intentionally separate from /auth profile token refs in
~/.loom/auth.toml.
delegate_task (used by /run) defaults to a 3600s timeout. Configure this in
loom.toml under [execution].delegate_task_timeout_seconds; env override
LOOM_DELEGATE_TIMEOUT_SECONDS still applies when set.
For artifact and overflow transparency telemetry in .events.jsonl, enable
[limits.runner].enable_artifact_telemetry_events (default true; set to false to disable).
Use artifact_telemetry_max_metadata_chars to bound handler metadata payload size.
For operator-facing runtime telemetry verbosity, use [telemetry].mode (off, active,
all_typed, debug). You can inspect/change process-local runtime mode through
GET/PATCH /settings/telemetry (loopback + admin token required when mutation is enabled,
via x-loom-admin-token or Authorization: Bearer ...), or via TUI slash command
/telemetry.
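For example, the settings endpoint can be scripted like this. The route and auth headers come from the description above; the JSON body shape (`{"mode": ...}`) and the port are assumptions for illustration.

```python
# Inspect and switch runtime telemetry mode over the /settings/telemetry endpoint.
import requests

BASE = "http://127.0.0.1:8000"            # loopback only; port is an assumption
ADMIN = {"x-loom-admin-token": "change-me"}

# Read the current process-local telemetry mode.
print(requests.get(f"{BASE}/settings/telemetry", headers=ADMIN).json())

# Bump verbosity to debug while investigating, then drop back to active later.
requests.patch(f"{BASE}/settings/telemetry", headers=ADMIN, json={"mode": "debug"})
```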
For large fetched binaries/documents (PDFs, Office files, archives), tune
[limits.runner] retention keys to control cleanup pressure:
ingest_artifact_retention_max_age_days,
ingest_artifact_retention_max_files_per_scope, and
ingest_artifact_retention_max_bytes_per_scope.
Enable runtime latency diagnostics:
LOOM_LATENCY_DIAGNOSTICS=1 uv run loom

This emits low-overhead timing lines for key paths (event-loop lag probes, MCP discovery/refresh, process index refresh, setup discovery, and API task preflight timing).
Run local startup/discovery latency smoke checks:
uv run python scripts/latency_smoke.py
uv run python scripts/latency_smoke.py --iterations 5 --workspace /path/to/workspace

The script reports mean/p50/p95 timings for process catalog scan and tool registry creation with sync vs background MCP startup modes.
Loom now uses explicit schema migrations tracked in schema_migrations.
On startup, existing DBs are upgraded before runtime features are allowed to
continue. If an existing DB cannot be upgraded, startup fails with a clear
actionable error instead of silently falling back to ephemeral mode.
For new DB initialization failures (path/permissions), startup is also a hard
error by default; use --ephemeral to opt into non-persistent mode.
Use:
uv run loom db status
uv run loom db migrate
uv run loom db doctor
uv run loom db backup

# explicit non-persistent mode when DB init fails
uv run loom --ephemeral

See migration authoring and policy details in docs/DB-MIGRATIONS.md.
Loom stores credential metadata (not plaintext secrets) in ~/.loom/auth.toml.
Workspace defaults can be set in ./.loom/auth.defaults.toml.
Resource registry + bindings are stored in ./.loom/auth.resources.toml.
Auth resolution happens at run start using this order:
- Explicit run overrides by resource_id / resource_ref / provider selector
- Workspace resource defaults (resource_id -> profile_id)
- User resource defaults (resource_id -> profile_id)
- Legacy provider defaults (workspace, then user)
- Auto-select when exactly one profile matches a required resource
Required auth resources are collected from:
- Process `auth.required`
- Declared tool `auth_requirements` on all allowed tools for the process (tools excluded by `process.tools.excluded` are not considered)
UX behavior:
- TUI `/auth` auto-discovers missing resource contracts and seeds draft profiles.
- TUI `/run` does not require a pre-step `/auth select ...`.
- If multiple profiles match, Loom prompts for a choice at run start.
- If auth is missing/invalid/expired/unbound, run start offers opening Auth Manager and retries preflight after changes.
- If exactly one profile exists for a resource, it is auto-defaulted for that workspace.
CLI:
- `uv run loom auth sync` runs the same draft/resource sync flow headlessly.
- `uv run loom auth select <selector> <profile_id>` persists a workspace default (provider selector or resource selector such as `api_integration:notion`).
- `uv run loom auth audit` reports orphaned/dangling auth state and exits non-zero on findings.
- `uv run loom auth migrate` infers resource bindings/defaults from legacy provider state.
- `uv run loom auth migrate` takes a preflight snapshot and auto-rolls back if migration fails.
- `uv run loom auth migrate --rollback <snapshot-dir>` restores auth files from a migration snapshot.
API behavior:
- Non-interactive unresolved auth returns HTTP 400 with a structured `code=auth_unresolved` payload so clients can choose a profile and retry; a client-side sketch follows.
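The retry loop on the client side might look like the sketch below. Only the `code=auth_unresolved` contract is documented above; the endpoint path, the payload field names (`resource_id`, `profiles`, `profile_id`), and the override body are assumptions for illustration.

```python
# Handle an auth_unresolved 400 by picking a profile and retrying the request.
import requests

def create_task(base: str, goal: str, auth_overrides: dict | None = None) -> dict:
    body = {"goal": goal}
    if auth_overrides:
        body["auth"] = auth_overrides          # hypothetical override field
    resp = requests.post(f"{base}/tasks", json=body)
    if (resp.status_code == 400
            and resp.json().get("code") == "auth_unresolved"
            and auth_overrides is None):
        detail = resp.json()
        # Pick a profile for the unresolved resource (here: the first candidate)
        # and retry the same request with an explicit override.
        choice = {detail["resource_id"]: detail["profiles"][0]["profile_id"]}
        return create_task(base, goal, auth_overrides=choice)
    resp.raise_for_status()
    return resp.json()
```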
A process definition injects a persona, phase blueprint, verification/remediation policy, evidence schema, and prompt constraints without changing engine code. Loom ships with 6 built-in processes: investment analysis, marketing strategy, research report, competitive intelligence, consulting engagement, and market research. You can create your own or install them from GitHub:
uv run loom processes # list available
uv run loom -w /tmp/acme
# /consulting-engagement Draft a Q2 market strategy
uv run loom install user/repo # install from GitHub
uv run loom install user/repo --isolated-deps # per-process dependency env
uv run loom process test consulting-engagement # run process test cases

Process-required tools are enforced at runtime: if `tools.required` contains missing tools, process activation/task creation fails fast with a clear error.
Process contract v2 is the recommended authoring format (schema_version: 2),
with behavior declared under verification.policy, verification.remediation,
evidence, validity_contract, and prompt_contracts. v1 definitions still load in compatibility
mode, with compatibility removal targeted for June 30, 2026.
Current schema additions for process safety/evidence rigor:
- `risk_level` (low|medium|high|critical) to make default rigor floors explicit.
- `validity_contract` for claim extraction, prune behavior, and synthesis gates.
- `final_gate.temporal_consistency` for as-of alignment, stale-source checks, and date-conflict detection.
schema_version: 2
risk_level: high
validity_contract:
enabled: true
claim_extraction: { enabled: true }
min_supported_ratio: 0.8
max_unverified_ratio: 0.2
max_contradicted_count: 0
prune_mode: rewrite_uncertainty
final_gate:
enforce_verified_context_only: true
synthesis_min_verification_tier: 2
critical_claim_support_ratio: 1.0
temporal_consistency:
enabled: true
require_as_of_alignment: true
enforce_cross_claim_date_conflict_check: true
max_source_age_days: 365

Both ad hoc and defined process runs use the same enforcement path for these
contracts (claim pruning/intermediate continuation, synthesis gating, and final
failure on unsupported critical claims). For full authoring + migration guidance,
see docs/creating-packages.md.
Run telemetry now includes run_validity_scorecard and
artifact_seal_validation for evidence/provenance audits.
Loom learns from your interactions so you never repeat yourself. Two learning modes work together:
Operational learning (autonomous tasks) -- after every task, Loom extracts model success rates, retry patterns, and successful plan templates. These inform future model selection and planning.
Behavioral learning (all interactions) -- Loom detects the gap between what the model delivered and what you actually wanted. When you say "test and lint it" after the model considers its code done, that's a gap signal. Loom extracts a general behavioral rule ("run tests and linter after writing code") and injects it into future prompts. Explicit corrections ("no, use JSON not YAML") are captured the same way.
Patterns are frequency-weighted -- the more a pattern is observed, the higher it ranks. High-frequency patterns persist indefinitely; low-frequency ones are pruned after 90 days. All data stays local in your SQLite database.
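The ranking and pruning rule can be sketched roughly as follows. Only the 90-day window comes from the description above; the occurrence threshold is an illustrative assumption.

```python
# Frequency-weighted ranking with age-based pruning of learned patterns.
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Pattern:
    rule: str
    occurrences: int
    last_seen: datetime

def rank(patterns: list[Pattern]) -> list[Pattern]:
    # More frequently observed patterns sort first and get injected into prompts earlier.
    return sorted(patterns, key=lambda p: p.occurrences, reverse=True)

def prune(patterns: list[Pattern], now: datetime, min_occurrences: int = 3) -> list[Pattern]:
    # High-frequency patterns persist indefinitely; low-frequency ones expire after 90 days.
    cutoff = now - timedelta(days=90)
    return [p for p in patterns if p.occurrences >= min_occurrences or p.last_seen >= cutoff]
```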
uv run loom learned # review learned behavioral patterns
uv run loom learned --all # include internal operational patterns
uv run loom learned --type behavioral_gap # filter by type
uv run loom learned --delete 5 # remove a specific pattern
uv run loom reset-learning # clear everything

In the TUI, use /learned to open an interactive review screen for learned behavioral patterns, where you can inspect and delete individual items.
- Interactive TUI (`uv run loom`) -- rich terminal interface with chat panel, sidebar, diff viewer, tool approval modals, event log, and setup wizard. Includes session persistence/recall, task delegation, process discovery (`/processes`), in-process orchestration (`/run <goal|@goal-file [goal]|close [run-id-prefix]>`), direct process commands (`/<process-name> <goal>`), learned-pattern review (`/learned`), MCP config controls (`/mcp ...`), auth profile controls (`/auth ...`), and click-to-open workspace previews for Markdown/code/JSON/CSV/HTML/diff/Office/PDF/images. `Ctrl+W` closes the active process-run tab with confirmation. `/run` executes in-process and does not require `uv run loom serve`; `/run problem.md` and `/run @problem.md optional-goal` load file content into planning context.
- REST API -- 20 endpoints for task CRUD, SSE streaming, steering, approval, feedback, and memory search
- MCP server -- Model Context Protocol integration so other agents can use Loom as a tool
uv run loom Launch the interactive TUI (default; setup wizard on first run)
uv run loom cowork Alias for the interactive TUI
uv run loom setup Run the configuration wizard (CLI fallback)
uv run loom run GOAL Autonomous task execution (server-backed) with `/run`-equivalent process resolution
uv run loom serve Start the API server
uv run loom status ID Check task status
uv run loom cancel ID Cancel a running task
uv run loom models List configured models
uv run loom auth ... Manage auth profiles/default selectors
uv run loom processes List available process definitions
uv run loom install SOURCE Install a process package
uv run loom uninstall NAME Remove a process package
uv run loom process test NAME Run process package test cases
uv run loom mcp ... Manage MCP server config (list/show/add/edit/remove/test/migrate)
uv run loom mcp-serve Start the MCP server (stdio transport)
uv run loom learned Review learned patterns (behavioral by default)
uv run loom reset-learning Clear all learned patterns
Common flags for uv run loom / uv run loom cowork:
- `-w /path` -- workspace directory
- `--mcp-config /path/to/mcp.toml` -- explicit MCP config layer
- `-m model` -- explicit cowork model override from config (can bypass role routing)
- `--resume <id>` -- resume a previous session
Role routing note:
- Orchestrator and verifier paths route by role (`planner`, `executor`, `extractor`, `verifier`, `compactor`).
- TUI helper calls (ad hoc process synthesis, run-folder naming) use role-selected helper models when configured.
- Run-folder naming is guardrailed: Loom accepts only clean kebab-case slugs and falls back to deterministic naming when model output is low quality.
Large Python codebase in src/ with an extensive automated test suite. No frameworks (no LangChain, no CrewAI).
src/loom/
__main__.py CLI (Click), TUI launcher (default command)
config.py TOML config loader
mcp/ MCP config manager + merge/migration logic
api/ FastAPI server, REST routes, SSE streaming
cowork/ Conversation session, approval, session state
engine/ Orchestrator, subtask runner, scheduler, verification
events/ Pub/sub event bus, persistence, webhooks
integrations/ MCP server
learning/ Pattern extraction from execution history
models/ Provider ABC + Ollama, OpenAI, Anthropic backends
processes/ Process definition loader + 6 built-in YAML processes
prompts/ 7-section prompt assembler with budget trimming
recovery/ Approval gates, confidence scoring, retry escalation
state/ Task state, SQLite memory archive, conversation store
tools/ 30 built-in tools with auto-discovery, safety, changelog + tree-sitter backend
tui/ Textual TUI: chat, sidebar, diff viewer, modals, events
uv sync --extra dev # or: pip install -e ".[dev]"
pytest # full test suite
ruff check src/ tests/ # lint

- Python 3.11+
- A model backend: Ollama, LM Studio, any OpenAI-compatible API, or Anthropic/Claude
MIT

