fix: Windows homedir, multi-session MCP, GPU override, collection routing #235

Open
kenaroik wants to merge 1 commit into tobi:main from kenaroik:windows-and-multi-session-mcp

Summary

Four improvements for Windows compatibility, multi-client HTTP server stability, GPU troubleshooting, and per-project collection scoping.

1. Fix: Windows home directory resolution

src/store.ts determined the home directory with process.env.HOME || "/tmp". On Windows, HOME is not a standard environment variable — it's set in Git Bash and WSL, but not in cmd.exe or PowerShell. This caused QMD to silently write its SQLite index to C:\tmp\.cache\qmd\ (or fail entirely), instead of the user's actual home directory.

Fix: Replaced with Node.js's cross-platform os.homedir(), which correctly resolves on all platforms (Windows: C:\Users\<name>, macOS: /Users/<name>, Linux: /home/<name>).
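A minimal sketch of the resolution described above. The helper name defaultCacheDir is hypothetical, and the .cache/qmd layout is inferred from the path quoted in this description; the actual code in src/store.ts may be organized differently.

```typescript
import { homedir } from "node:os";
import { join } from "node:path";

// Hypothetical helper (not the actual function in src/store.ts).
// os.homedir() reads USERPROFILE on Windows and HOME (or the passwd
// entry) on POSIX systems, so no "/tmp" fallback is needed.
export function defaultCacheDir(home: string = homedir()): string {
  // join() uses the platform separator, e.g.
  // C:\Users\<name>\.cache\qmd on Windows, /home/<name>/.cache/qmd on Linux.
  return join(home, ".cache", "qmd");
}
```

Accepting the home directory as a parameter keeps the helper easy to test without touching environment variables.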

2. Feature: QMD_GPU environment variable

node-llama-cpp ships pre-built CUDA binaries linked against a specific cuBLAS version. When the user's installed CUDA toolkit has a different major version (e.g., the binary needs cublas64_13.dll but CUDA 12.x only ships cublas64_12.dll), GPU initialization crashes with ggml-cuda.cu: CUDA error. Rebuilding from source requires Visual C++ Build Tools, which many users don't have.

Fix: Added QMD_GPU env var to ensureLlama() in src/llm.ts. Users can bypass broken CUDA with:

QMD_GPU=vulkan qmd embed     # Use Vulkan instead of CUDA
QMD_GPU=false qmd embed      # Force CPU-only

Values: cuda, metal, vulkan, false/off/0 (CPU). When not set, behavior is unchanged (auto-detect best GPU).
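The value mapping above could be parsed roughly as follows. This is a sketch, not the code in ensureLlama(): the function name parseGpuOverride, the case-insensitive matching, and the fallback to auto-detection for unrecognized values are all assumptions.

```typescript
// Result type: a named backend, false for CPU-only, or "auto" for
// the unchanged auto-detect behavior.
type GpuOverride = "auto" | "cuda" | "metal" | "vulkan" | false;

// Hypothetical parser for the QMD_GPU environment variable.
export function parseGpuOverride(raw: string | undefined): GpuOverride {
  // Unset or empty: keep the default auto-detection.
  if (raw === undefined || raw.trim() === "") return "auto";

  switch (raw.trim().toLowerCase()) {
    case "false":
    case "off":
    case "0":
      return false; // force CPU-only
    case "cuda":
      return "cuda";
    case "metal":
      return "metal";
    case "vulkan":
      return "vulkan";
    default:
      // Assumption: unknown values fall back to auto-detection
      // rather than raising an error.
      return "auto";
  }
}
```

A caller would read process.env.QMD_GPU once, pass it through this function, and hand the result to whatever GPU option the LLM loader accepts.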

3. Fix: Per-session MCP HTTP transport

The HTTP server (qmd mcp --http) previously created a single McpServer + WebStandardStreamableHTTPServerTransport pair at startup. All connecting clients shared this one transport, which caused session conflicts:

  • Multiple Claude Code instances connecting to the same daemon would interfere with each other's sessions
  • Reconnecting after a disconnect could fail because the transport still held state from the old connection
  • Session ID collisions between unrelated clients

Fix: Each initialize request now creates a dedicated session — its own transport and McpServer instance — tracked by session ID in a Map. All sessions share the underlying SQLite store (which is already thread-safe). On shutdown, all active sessions are closed cleanly. The /health endpoint now reports the active session count.
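The session bookkeeping described above can be sketched with a plain Map. The Session interface here is a stand-in for the real transport and McpServer pair, and SessionRegistry and its method names are hypothetical; the code in src/mcp.ts may be structured differently.

```typescript
import { randomUUID } from "node:crypto";

// Stand-in for a transport + McpServer pair (assumption: the real
// objects come from the MCP SDK and carry more state than this).
interface Session {
  id: string;
  close(): Promise<void>;
}

export class SessionRegistry {
  private readonly sessions = new Map<string, Session>();

  // Called once per initialize request: each client gets a fresh session.
  create(onClose: () => Promise<void> = async () => {}): Session {
    const id = randomUUID();
    const session: Session = {
      id,
      close: async () => {
        this.sessions.delete(id);
        await onClose();
      },
    };
    this.sessions.set(id, session);
    return session;
  }

  // Follow-up requests are routed by the session ID the client sends back.
  get(id: string | null | undefined): Session | undefined {
    return id ? this.sessions.get(id) : undefined;
  }

  // Payload shape for the /health endpoint's session count.
  health(): { sessions: number } {
    return { sessions: this.sessions.size };
  }

  // On shutdown, close every session. The values are copied first
  // because close() removes entries from the map.
  async closeAll(): Promise<void> {
    await Promise.all(Array.from(this.sessions.values()).map((s) => s.close()));
  }
}
```

Because each session owns its own state, a reconnecting client simply initializes again and receives a new ID instead of inheriting stale state from the previous connection.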

4. Feature: URL-based collection routing (/mcp/COLLECTION)

When QMD runs as a shared HTTP daemon (qmd mcp --http --daemon), there's no way to pass per-project environment variables. Different projects may want to search different collections by default.

Fix: The URL path now optionally includes a collection name:

URL              Behavior
/mcp             All collections (unchanged)
/mcp/notes       Scoped to "notes" collection
/mcp/work-docs   Scoped to "work-docs" collection

The collection name is extracted on initialize and applied as the default for all searches in that session. The LLM instructions reflect the scoped collection name and document count.
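One way the path extraction could work is shown below. The name collectionFromPath appears in this PR's file list, but the body here is a guess: the regular expression, the tolerance for a trailing slash, and the URL decoding are assumptions rather than the actual implementation.

```typescript
// Sketch only: returns the collection segment of /mcp/COLLECTION,
// or undefined for plain /mcp and for paths that do not match.
export function collectionFromPath(pathname: string): string | undefined {
  // Matches "/mcp", "/mcp/", "/mcp/<name>" and "/mcp/<name>/".
  const match = /^\/mcp(?:\/([^/]+))?\/?$/.exec(pathname);
  if (!match || match[1] === undefined) return undefined;
  // Decode percent-escapes such as %20. Note that decodeURIComponent
  // throws on malformed input, so a real server may want a try/catch.
  return decodeURIComponent(match[1]);
}
```

Returning undefined for plain /mcp keeps the existing all-collections behavior as the default.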

Configuration example — in per-project Claude Code settings (.claude/settings.json):

{
  "mcpServers": {
    "qmd": {
      "type": "http",
      "url": "http://localhost:8181/mcp/my-project-docs"
    }
  }
}

For stdio transport, the QMD_COLLECTION env var provides the same scoping.

Files changed

File           Change
src/store.ts   process.env.HOME || "/tmp" → os.homedir()
src/llm.ts     QMD_GPU env var in ensureLlama()
src/mcp.ts     Per-session transport, collectionFromPath(), resolveTransport(), createSession(), buildInstructions() collection override, health endpoint session count

Test plan

  • TypeScript builds cleanly (tsc -p tsconfig.build.json)
  • Existing test suite passes (3 pre-existing Windows-only failures unrelated to this PR: symlink tests require admin privileges)
  • GET /health returns { sessions: N }
  • POST /mcp initializes with all collections, instructions show total doc count
  • POST /mcp/RAMP initializes scoped to RAMP, instructions show scoped doc count and collection name
  • Multiple concurrent sessions work independently
  • QMD_GPU=vulkan successfully bypasses broken CUDA initialization

…ting

- fix(store): use os.homedir() instead of process.env.HOME || "/tmp"
  On Windows cmd.exe, HOME is not set, causing QMD to write its index
  to /tmp/.cache/qmd/ instead of the user's home directory.

- feat(llm): add QMD_GPU env var to override GPU backend
  Allows forcing cuda, metal, vulkan, or CPU (false/off/0) when the
  auto-detected GPU fails (e.g., CUDA version mismatch with pre-built
  binaries).

- fix(mcp): per-session transport for HTTP server
  Each client now gets its own transport + McpServer instance, sharing
  one SQLite store. Previously, multiple clients (e.g., several Claude
  Code instances) sharing one HTTP daemon would conflict because they
  shared a single transport.

- feat(mcp): URL-based collection routing (/mcp/COLLECTION)
  Connecting to /mcp/RAMP scopes all searches to the RAMP collection.
  Enables per-project collection scoping when using HTTP transport,
  since env vars cannot be passed per-session to a shared daemon.
  Also supports QMD_COLLECTION env var for stdio transport.