Contributing to CodeGraph

Thanks for helping improve CodeGraph. This repo is a Rust workspace with multiple crates under crates/, plus SurrealDB schemas under schema/.

Quick start

Install Rust (via rustup) and any platform prerequisites from docs/INSTALLATION_GUIDE.md.
Run a fast compile check across the whole workspace:

cargo check --workspace

Run tests, starting with the crate you touched (for example, codegraph-mcp):

cargo test -p codegraph-mcp

Format before you send a PR:

cargo fmt

What to change where

Indexing pipeline

Orchestration and analyzer phases: crates/codegraph-mcp/src/indexer.rs
Analyzer implementations: crates/codegraph-mcp/src/analyzers/
Parser and AST extraction: crates/codegraph-parser/src/

Embeddings and reranking

Provider implementations: crates/codegraph-vector/src/
Runtime config loader (TOML + .env + env overrides): crates/codegraph-core/src/config_manager.rs
Provider-specific config structs: crates/codegraph-core/src/config_manager.rs and crates/codegraph-core/src/rerank_config.rs
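
The layering named above (TOML, then .env, then environment overrides) can be pictured with a minimal sketch. The precedence order, the key names, and the merge_layers and layer helpers are assumptions for illustration; they are not the actual API of config_manager.rs.

```rust
use std::collections::HashMap;

/// Hypothetical helper: merge config layers so that later layers win.
/// Assumed order: TOML file < .env file < process environment.
fn merge_layers(layers: &[HashMap<String, String>]) -> HashMap<String, String> {
    let mut merged = HashMap::new();
    for layer in layers {
        for (key, value) in layer {
            // A later layer overwrites any value set by an earlier one.
            merged.insert(key.clone(), value.clone());
        }
    }
    merged
}

/// Build a layer from literal pairs (stands in for real file/env parsing).
fn layer(pairs: &[(&str, &str)]) -> HashMap<String, String> {
    pairs
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect()
}
```

If a provider setting seems to be ignored at runtime, checking each layer in this order is usually the quickest way to find which source is overriding it.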

Built-in agent (agentic MCP tools)

MCP server and tool entrypoints: crates/codegraph-mcp-server/src/official_server.rs
Prompt tier selection: crates/codegraph-mcp-server/src/prompt_selector.rs
Tier prompts per analysis type: crates/codegraph-mcp-server/src/*_prompts.rs
LLM providers: crates/codegraph-ai/src/

Adding a new language

Adding a language to CodeGraph typically touches three layers:

Core language enum: add or confirm the language exists in crates/codegraph-core/src/types.rs
Tree-sitter registration: add the grammar dependency and register extensions in crates/codegraph-parser/src/language.rs
Extraction: implement node/edge extraction for the new language in the parser pipeline (follow the existing extractor patterns in crates/codegraph-parser/src/)
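
The first two layers can be sketched roughly as follows. The enum variants, the Zig example language, and the function names are hypothetical; the real definitions in types.rs and language.rs may be shaped differently, and the tree-sitter grammar wiring and the extraction layer are omitted here.

```rust
/// Hypothetical subset of a core language enum (the real one lives in
/// crates/codegraph-core/src/types.rs and may look different).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Language {
    Rust,
    Python,
    /// The newly added language in this sketch.
    Zig,
}

/// Hypothetical extension registry: maps a file extension to a language.
/// In the real parser, this step would also wire up the tree-sitter grammar.
fn language_for_extension(ext: &str) -> Option<Language> {
    match ext.to_ascii_lowercase().as_str() {
        "rs" => Some(Language::Rust),
        "py" | "pyi" => Some(Language::Python),
        "zig" => Some(Language::Zig),
        _ => None,
    }
}

/// Detect the language of a path from its extension, if registered.
fn detect_language(path: &str) -> Option<Language> {
    std::path::Path::new(path)
        .extension()
        .and_then(|ext| ext.to_str())
        .and_then(language_for_extension)
}
```

A registration test can then assert that a path such as src/main.zig resolves to the new variant and that unregistered extensions resolve to None.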

After you add a language:

Update docs/SUPPORTED_LANGUAGES.md
Add or update tests that validate language registration and basic parsing

Adding an analyzer

Analyzers run during codegraph index to enrich the initial AST graph.

Guidelines:

Prefer small, composable analyzers that add clearly attributable nodes/edges.
Ensure analyzer output is deterministic and scoped by project_id.
Emit provenance metadata (analyzer name, confidence) so later tools can explain where facts came from.
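
As a rough illustration of these guidelines, the sketch below shows a small analyzer whose output is deterministic, scoped by project_id, and tagged with provenance. The Analyzer trait, the Edge and Provenance types, and the confidence value are all assumptions made for this example, not the interfaces in crates/codegraph-mcp/src/analyzers/.

```rust
use std::collections::BTreeSet;

/// Provenance attached to every emitted fact (field names are assumptions).
#[derive(Debug, Clone, PartialEq)]
struct Provenance {
    analyzer: &'static str,
    confidence: f32,
}

/// A hypothetical edge produced by an analyzer, scoped to one project.
#[derive(Debug, Clone, PartialEq)]
struct Edge {
    project_id: String,
    from: String,
    to: String,
    kind: &'static str,
    provenance: Provenance,
}

/// Hypothetical analyzer interface; the real one may differ.
trait Analyzer {
    fn name(&self) -> &'static str;
    fn analyze(&self, project_id: &str, calls: &[(&str, &str)]) -> Vec<Edge>;
}

/// Small, single-purpose analyzer: turns (caller, callee) pairs into edges.
struct CallEdgeAnalyzer;

impl Analyzer for CallEdgeAnalyzer {
    fn name(&self) -> &'static str {
        "call_edges"
    }

    fn analyze(&self, project_id: &str, calls: &[(&str, &str)]) -> Vec<Edge> {
        // A BTreeSet deduplicates and sorts the input, so the output order
        // does not depend on the order in which files were visited.
        let unique: BTreeSet<(&str, &str)> = calls.iter().copied().collect();
        unique
            .into_iter()
            .map(|(from, to)| Edge {
                project_id: project_id.to_string(),
                from: from.to_string(),
                to: to.to_string(),
                kind: "calls",
                provenance: Provenance {
                    analyzer: self.name(),
                    confidence: 0.9, // illustrative value only
                },
            })
            .collect()
    }
}
```

Sorting and deduplicating before emitting is one simple way to get deterministic output; running the analyzer twice on differently ordered input should then yield identical edges.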

If your analyzer depends on external tools (e.g. language servers), add the tool requirement in:

crates/codegraph-mcp/src/analyzers/mod.rs

Schema changes (SurrealDB)

Schema files live in schema/:

schema/codegraph.surql (default schema)
schema/codegraph_graph_experimental.surql (experimental graph-oriented schema)

If you change a schema:

Update any schema-level tests that validate required functions/indexes.
Keep SurrealQL compatible with current SurrealDB parsing rules.
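
A schema-level test of this kind might look like the sketch below. The missing_definitions helper and any index or function names passed to it are hypothetical; the check is a plain substring match, not a SurrealQL parser, so it only catches definitions that were dropped or renamed.

```rust
/// Return the required definitions that are absent from the schema text.
/// Matching is a substring check after whitespace normalization, which is
/// enough for a smoke test but does not validate SurrealQL syntax.
fn missing_definitions<'a>(schema: &str, required: &[&'a str]) -> Vec<&'a str> {
    // Collapse runs of whitespace so formatting changes do not break the test.
    let normalized = schema.split_whitespace().collect::<Vec<_>>().join(" ");
    required
        .iter()
        .copied()
        .filter(|needle| !normalized.contains(*needle))
        .collect()
}
```

In a real test, the schema text would typically be read from schema/codegraph.surql (for example with std::fs::read_to_string), and the test would assert that the returned list is empty.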

Documentation updates

If your change affects configuration, providers, or language support, update whichever of these apply:

docs/AI_PROVIDERS.md
docs/AGENT_PROMPT_TIERS.md
docs/SUPPORTED_LANGUAGES.md

PR checklist

cargo test passes for touched crates
cargo fmt produces no changes
No new secrets committed
Docs updated for user-facing behavior changes
