
A Chat web client for chatting with open-source LLMs deployed behind a vLLM inference server


brainqub3/brainqub3_chat


Brainqub3 Chat

Brainqub3 Chat is a local-first Next.js workspace that wraps a Brainqub3-themed chat UI, a vLLM OpenAI-compatible orchestrator, and an MCP bridge so models can call local tools. The UI, API routes, and MCP helpers all run on your machine; only the vLLM endpoint needs to be reachable over HTTP, which can be local or a remote RunPod deployment.

Features

  • Brainqub3 UI: Always-dark layout with cyan/purple glow accents, Geist typography, streaming messages, a caret animation, keyboard shortcuts (⌘/Ctrl+K for a new chat, ⌘/Ctrl+Enter to send), and editing of the last prompt.
  • Session management: Multi-session rail with previews, rename-on-first-answer, delete, and automatic persistence to localStorage.
  • vLLM orchestrator: /api/chat forwards OpenAI-style chat completion requests (with tool_choice:"auto") to a configurable VLLM_BASE_URL, loops through tool calls, and streams Server-Sent Events back to the browser.
  • MCP bridge (experimental): /api/mcp/* endpoints manage stdio or HTTP MCP servers via the official TypeScript SDK, exposing each MCP tool as an OpenAI tool (mcp:<serverId>:<toolName>). This pathway is currently untested end-to-end, so expect to troubleshoot transports if you enable it.
  • Local controls: Model picker, per-session system prompt pill, estimated token budget bar, and live MCP server status cards.
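The streaming path works by relaying Server-Sent Events from `/api/chat` to the browser: each event carries a `data:` payload, and a literal `[DONE]` marks the end of the stream. As a rough, stand-alone sketch of the wire format (the app itself uses eventsource-parser; this simplified line-level extractor is illustrative only):

```typescript
// Minimal SSE payload extractor (illustrative; the app uses
// eventsource-parser). Splits a raw chunk into complete events and
// returns the `data:` payloads; "[DONE]" marks end-of-stream.
function extractSseData(chunk: string): string[] {
  const payloads: string[] = [];
  for (const event of chunk.split("\n\n")) {
    for (const line of event.split("\n")) {
      if (line.startsWith("data:")) {
        const data = line.slice("data:".length).trim();
        if (data && data !== "[DONE]") payloads.push(data);
      }
    }
  }
  return payloads;
}
```

A real client would additionally buffer partial chunks across reads, which is exactly what eventsource-parser handles.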

Local setup

Prerequisites

  1. Node.js 18+ – required for the App Router and Node runtime routes.
  2. vLLM endpoint – any OpenAI-compatible vLLM server. You can:
    • Run it locally (see Running vLLM locally below), or
    • Deploy your own container on RunPod. Start from the RunPod vLLM template, then follow the official RunPod docs to finish configuring the worker and expose the HTTPS endpoint you will paste into VLLM_BASE_URL.
  3. Optional: any MCP servers (stdio binaries or HTTP endpoints) you want the model to call.

1. Install dependencies

npm install

2. Configure environment

Create .env.local in the project root:

VLLM_BASE_URL=http://localhost:8000              # or the HTTPS URL from your RunPod deployment
DEFAULT_MODEL=moonshotai/Kimi-K2-Thinking        # server-side default passed to vLLM
NEXT_PUBLIC_DEFAULT_MODEL=moonshotai/Kimi-K2-Thinking

The defaults ship with the Kimi K2 model; change both variables if you point at a different checkpoint. DEFAULT_MODEL drives the API’s fallback choice, while NEXT_PUBLIC_DEFAULT_MODEL seeds new sessions on the client.
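The split between the two variables can be pictured with a hypothetical helper (not the app's actual code): the API route honours an explicit model from the request first, then falls back to `DEFAULT_MODEL`.

```typescript
// Illustrative fallback chain (hypothetical helper, not the app's code):
// request body model -> server-side DEFAULT_MODEL -> hard-coded default.
function resolveModel(requested?: string): string {
  return requested?.trim() || process.env.DEFAULT_MODEL || "moonshotai/Kimi-K2-Thinking";
}
```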

3. Run the services locally

Running vLLM locally

python -m vllm.entrypoints.openai.api_server \
  --model moonshotai/Kimi-K2-Thinking \
  --host 0.0.0.0 --port 8000

If you prefer an on-demand GPU endpoint, deploy the RunPod template linked above, then set VLLM_BASE_URL to the provided HTTPS endpoint. The Next.js app treats local and remote URLs the same.
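Before starting the UI, you can sanity-check that the endpoint you put in VLLM_BASE_URL is reachable. vLLM's OpenAI-compatible server exposes `GET /v1/models`; a 200 response means the base URL works for both local and RunPod deployments. A stand-alone sketch (not part of the app):

```typescript
// Quick connectivity check against an OpenAI-compatible vLLM endpoint.
// vLLM serves GET /v1/models; a 200 with a model list means the base
// URL is usable. (Stand-alone sketch, not part of the app.)
function modelsUrl(base: string): string {
  // Tolerate a trailing slash in the configured base URL.
  return `${base.replace(/\/+$/, "")}/v1/models`;
}

async function checkVllm(base: string): Promise<boolean> {
  try {
    const res = await fetch(modelsUrl(base));
    return res.ok;
  } catch {
    return false; // endpoint unreachable
  }
}
```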

Start the Next.js workspace

npm run dev

Visit http://localhost:3000, create a chat, and start messaging. Expand MCP Servers to register stdio or HTTP transports; enabled servers automatically expose their tools to the model and show up under the “+Tools” indicator.
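Tools registered this way reach the model under the `mcp:<serverId>:<toolName>` naming scheme described above. When a tool call comes back, the name has to be split on the first two colons only, since tool names may themselves contain colons. An illustrative parser (field names are assumptions, not the app's actual types):

```typescript
// Parse the "mcp:<serverId>:<toolName>" scheme from the README.
// Splits on the first two colons only, so tool names containing
// colons survive intact. (Illustrative helper.)
function parseMcpToolName(name: string): { serverId: string; toolName: string } | null {
  if (!name.startsWith("mcp:")) return null; // not an MCP-bridged tool
  const rest = name.slice("mcp:".length);
  const sep = rest.indexOf(":");
  if (sep < 0) return null; // malformed: no tool name present
  return { serverId: rest.slice(0, sep), toolName: rest.slice(sep + 1) };
}
```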

Development notes

  • API routes opt into the Node runtime so the MCP stdio transports can spawn child processes via child_process.
  • Streaming uses Server-Sent Events and eventsource-parser to buffer tool-call metadata while emitting text deltas immediately.
  • The MCP registry lives in memory; restarting npm run dev clears MCP state, but chat sessions remain in the browser thanks to localStorage.
  • Tailwind centralizes Brainqub3 design tokens (glows, gradients, type scale) so the sidebar, chat pane, and tool cards stay consistent.
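The tool-call buffering mentioned above is needed because OpenAI-style streams deliver tool calls as indexed fragments: ids and names appear once, while the `function.arguments` JSON arrives in pieces that must be concatenated before the call can be executed. A simplified merger (the types are assumptions based on the OpenAI streaming format, not the app's actual code):

```typescript
// Simplified merge of OpenAI-style streamed tool-call deltas: fragments
// arrive keyed by `index`; ids/names appear once, argument JSON arrives
// in pieces to concatenate. (Types are illustrative.)
interface ToolCallDelta {
  index: number;
  id?: string;
  function?: { name?: string; arguments?: string };
}

interface MergedToolCall {
  id: string;
  name: string;
  arguments: string;
}

function mergeToolCallDeltas(deltas: ToolCallDelta[]): MergedToolCall[] {
  const calls = new Map<number, MergedToolCall>();
  for (const d of deltas) {
    const call = calls.get(d.index) ?? { id: "", name: "", arguments: "" };
    if (d.id) call.id = d.id;
    if (d.function?.name) call.name += d.function.name;
    if (d.function?.arguments) call.arguments += d.function.arguments;
    calls.set(d.index, call);
  }
  return [...calls.values()];
}
```

Only once a call is fully assembled can its arguments be JSON-parsed and dispatched to the matching MCP tool.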

NPM scripts

  • npm run dev – Next.js dev server
  • npm run build – production build
  • npm run start – serve the production build
  • npm run lint – ESLint

Possible enhancements

Ideas for later: persist sessions to disk, add an MCP prompt/resource library, or integrate token-aware summarization when transcripts approach the context window.
