A production-grade Model Context Protocol (MCP) server that augments local LLM inference (via vLLM or Ollama) with real-time external data (e.g., weather, finance, news) to make responses dynamic, accurate, and context-aware.
The system is designed for modularity, scalability, and performance in local or hybrid LLM deployments, with support for concurrent multi-user interactions.
- LLM-Aware Prompt Augmentation
- Real-Time Tool Integration (Weather, Finance, News APIs)
- Modular Architecture (plug-in style tool injection)
- High-performance Go backend
- Support for vLLM/Ollama via HTTP
- React frontend
- Testable and extensible core logic
- Observability + tracing support
- Docker/Kubernetes deployment ready
- Concurrent Multi-User Support: handles simultaneous user queries while maintaining context isolation and data security
```
nekton/
├── nekton-client/                  # React frontend
│   ├── web/                        # Frontend (React)
│   ├── Makefile
│   └── README.md
│
└── nekton-server/                  # Backend (Hexagonal MCP)
    ├── cmd/                        # Entrypoints and wire-up
    │   └── api/
    │       ├── main.go             # Starts the server
    │       └── container.go        # DI container (using dig or fx)
    │
    ├── internal/
    │   ├── domain/                 # Core business logic (independent)
    │   │   ├── contextor/          # Planner, enricher, prompt builder
    │   │   │   ├── planner.go
    │   │   │   ├── enricher.go
    │   │   │   └── prompt_builder.go
    │   │   ├── tool/               # Tool behavior and data contract
    │   │   │   └── tool.go
    │   │   ├── llm/                # LLM inference port interface
    │   │   │   └── llm.go
    │   │   ├── session/            # Session management for multi-user support
    │   │   │   └── session_manager.go  # Handles user session lifecycle
    │   │   └── port/               # Hexagonal ports (interfaces)
    │   │       ├── tool_port.go
    │   │       ├── llm_port.go
    │   │       ├── session_port.go
    │   │       └── audit_port.go
    │   │
    │   ├── adapter/                # External systems (driven adapters)
    │   │   ├── http/               # HTTP server adapter (fasthttp)
    │   │   │   └── handler.go
    │   │   ├── llm/                # LLM backend (vLLM/Ollama)
    │   │   │   └── vllm_client.go
    │   │   ├── tool/
    │   │   │   ├── weather_adapter.go
    │   │   │   ├── finance_adapter.go
    │   │   │   └── news_adapter.go
    │   │   ├── redis/              # Redis for sessions + cache
    │   │   │   └── redis_store.go
    │   │   ├── kafka/              # Kafka producer/consumer
    │   │   │   └── kafka_client.go
    │   │   └── postgres/           # PostgreSQL audit + cold data
    │   │       └── audit_logger.go
    │   │
    │   ├── infra/                  # Logging, config, tracing, shared
    │   │   ├── config/
    │   │   ├── logger/
    │   │   ├── observability/
    │   │   └── errors/
    │   │
    │   └── tests/                  # Unit + integration tests
    │       ├── mocks/
    │       └── integration/
    │
    ├── scripts/                    # Bootstrap & operational scripts
    ├── api/                        # OpenAPI / gRPC definitions
    ├── deployments/                # Docker + K8s manifests
    ├── Makefile
    ├── go.mod
    └── README.md
```
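The `port/` package carries the hexagonal boundaries: the domain depends only on interfaces, and adapters satisfy them. As a rough sketch of what `tool_port.go` and `llm_port.go` might contain (the interface names and signatures here are assumptions, not the repository's actual code):

```go
package port

import "context"

// ToolPort is the driven-side contract every external tool adapter
// (weather, finance, news) must satisfy. The domain layer depends only
// on this interface, never on a concrete HTTP client.
type ToolPort interface {
	// Name identifies the tool to the planner (e.g., "weather").
	Name() string
	// Fetch retrieves raw external data for the given query.
	Fetch(ctx context.Context, query string) (string, error)
}

// LLMPort abstracts the inference backend (vLLM or Ollama).
type LLMPort interface {
	// Complete returns a full completion for the final prompt.
	Complete(ctx context.Context, prompt string) (string, error)
	// Stream emits incremental tokens on the returned channel.
	Stream(ctx context.Context, prompt string) (<-chan string, error)
}
```

Swapping vLLM for Ollama then means writing a new adapter against `LLMPort`; nothing in `domain/` changes.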
```mermaid
%%{
  init: {
    'theme': 'base',
    'themeVariables': {
      'primaryColor': '#BB2528',
      'primaryTextColor': '#fff',
      'primaryBorderColor': '#7C0000',
      'lineColor': '#F8B229',
      'secondaryColor': '#006100',
      'tertiaryColor': '#fff',
      'lineWidth': 8,
      'fontSize': 30
    }
  }
}%%
flowchart TD
    A[User Input] --> B[API Gateway HTTP/gRPC]
    B --> C[Context Planner]
    C --> D[Tool Orchestrator]
    D --> E[Prompt Builder]
    E --> F[LLM Inference Engine<br/>vLLM / Ollama]
    F --> G[Response Handler]
    G --> H[User Output]
    D --> T[Weather, Finance, News APIs]
    T --> D
    D -->|Feedback| C
    B --> I[Session Manager] --> J[Manage User Context]
    I --> K[Redis Cache]
```
```mermaid
%%{
  init: {
    'theme': 'base',
    'themeVariables': {
      'primaryColor': '#BB2528',
      'primaryTextColor': '#fff',
      'primaryBorderColor': '#7C0000',
      'lineColor': '#F8B229',
      'secondaryColor': '#006100',
      'tertiaryColor': '#fff',
      'lineWidth': 8,
      'fontSize': 30
    }
  }
}%%
flowchart TD
    UI[Client UI] --> APIGW[REST/gRPC API Gateway]
    subgraph Gateway
        APIGW --> Auth[Auth + Rate Limiting]
        APIGW --> Planner[Context Planner + Tool Router]
        APIGW --> Redis[Session Cache Redis]
    end
    Planner -->|Dispatch| WeatherTool[Weather Tool]
    Planner -->|Dispatch| FinanceTool[Finance Tool]
    Planner -->|Dispatch| NewsTool[News Tool]
    WeatherTool --> WeatherEnricher[Weather Tool Enricher]
    FinanceTool --> FinanceEnricher[Finance Tool Enricher]
    NewsTool --> NewsEnricher[News Tool Enricher]
    WeatherEnricher --> PromptBuilder
    FinanceEnricher --> PromptBuilder
    NewsEnricher --> PromptBuilder
    PromptBuilder[Prompt Builder] --> LLM[vLLM / Ollama]
    LLM --> RespBuilder[Response Builder]
    RespBuilder --> Postgres[PostgreSQL audit + cold DB]
    APIGW --> SessionManager[Session Manager] --> Redis
```
- Session Management: The `SessionManager` component ensures that each user's query is handled in an isolated context. It supports user authentication, maintains state across requests, and stores session data in Redis for fast retrieval.
- Context Isolation: Each user interaction with the LLM is processed independently, ensuring that context for one user does not interfere with another. This is critical for handling concurrent users.
- Rate Limiting: Each user is subject to configurable rate limits to prevent abuse and ensure system stability.
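As a minimal sketch of the session lifecycle, here is an in-memory manager; the real `session_manager.go` persists to Redis behind `session_port.go`, and the type and method names below are hypothetical:

```go
package session

import (
	"context"
	"errors"
	"sync"
	"time"

	"github.com/google/uuid"
)

// Session holds one user's isolated conversation context.
type Session struct {
	ID        string
	UserID    string
	CreatedAt time.Time
	ExpiresAt time.Time
}

var ErrNotFound = errors.New("session not found or expired")

// Manager is an in-memory illustration; the Redis adapter would
// implement the same behavior behind the session port.
type Manager struct {
	mu       sync.RWMutex
	sessions map[string]Session
	ttl      time.Duration
}

func NewManager(ttl time.Duration) *Manager {
	return &Manager{sessions: make(map[string]Session), ttl: ttl}
}

// Create issues a fresh, isolated session for a user.
func (m *Manager) Create(ctx context.Context, userID string) Session {
	s := Session{
		ID:        uuid.NewString(),
		UserID:    userID,
		CreatedAt: time.Now(),
		ExpiresAt: time.Now().Add(m.ttl),
	}
	m.mu.Lock()
	m.sessions[s.ID] = s
	m.mu.Unlock()
	return s
}

// Validate returns the session if it exists and has not expired.
func (m *Manager) Validate(ctx context.Context, id string) (Session, error) {
	m.mu.RLock()
	s, ok := m.sessions[id]
	m.mu.RUnlock()
	if !ok || time.Now().After(s.ExpiresAt) {
		return Session{}, ErrNotFound
	}
	return s, nil
}
```

Keying all per-user state by session ID is what makes concurrent queries safe: two users never share a `Session` value.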
Service entrypoints:
- `api-gateway`: Starts the HTTP API using `fasthttp` for high-performance routing
- `worker`: Background job runner (e.g., cron for refreshes)
- `tools-runner`: Manual tool testing CLI
Versioned REST/gRPC API layer.
- `handlers/`: Input/output translation
- `controllers/`: Business logic
- `middleware/`: Auth, logging, rate limiting
- `schemas/`: Request/response type definitions
The MCP core: it determines which tools to use and how to inject their data.
- `engine.go`: Overall pipeline
- `planner.go`: Chooses relevant tools
- `enricher.go`: Gathers external context
- `prompt_builder.go`: Final prompt construction
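A condensed, hypothetical sketch of how these three stages compose; the actual `engine.go` API may differ:

```go
package contextor

import (
	"context"
	"fmt"
	"strings"
)

// Tool mirrors the domain tool port: it can say whether it applies to a
// query and fetch external context for it.
type Tool interface {
	Name() string
	Matches(query string) bool
	Fetch(ctx context.Context, query string) (string, error)
}

// Engine wires planner -> enricher -> prompt builder into one pipeline.
type Engine struct {
	Tools []Tool
}

// BuildPrompt selects relevant tools, gathers their context, and
// assembles the final augmented prompt.
func (e *Engine) BuildPrompt(ctx context.Context, query string) (string, error) {
	var fragments []string
	for _, t := range e.Tools { // planner: pick relevant tools
		if !t.Matches(query) {
			continue
		}
		data, err := t.Fetch(ctx, query) // enricher: gather external context
		if err != nil {
			return "", fmt.Errorf("tool %s: %w", t.Name(), err)
		}
		fragments = append(fragments, fmt.Sprintf("[TOOL: %s]\n%s", t.Name(), data))
	}
	// prompt builder: final prompt construction
	return strings.Join(fragments, "\n\n") +
		"\n\nAnswer the following query using the context above:\n" + query, nil
}
```

The output format matches the augmented-prompt example shown later in this README.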
LLM client abstraction:
- `client.go`: HTTP interface to vLLM or Ollama
- `models.go`: Model registry/config
- `streamer.go`: Streaming completion support
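For illustration, a minimal non-streaming client against vLLM's OpenAI-compatible `/v1/completions` endpoint; the actual `client.go` may differ, and `streamer.go` would layer SSE handling on top of the same request with `"stream": true`:

```go
package llm

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"net/http"
)

// Client is a minimal sketch of the LLM adapter.
type Client struct {
	BaseURL string // e.g., "http://localhost:8000"
	Model   string
	HTTP    *http.Client
}

// Complete posts the prompt and returns the first completion.
func (c *Client) Complete(ctx context.Context, prompt string) (string, error) {
	if c.HTTP == nil {
		c.HTTP = http.DefaultClient
	}
	body, _ := json.Marshal(map[string]any{
		"model":  c.Model,
		"prompt": prompt,
	})
	req, err := http.NewRequestWithContext(ctx, http.MethodPost,
		c.BaseURL+"/v1/completions", bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	req.Header.Set("Content-Type", "application/json")

	resp, err := c.HTTP.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	// Decode only the fields we need from the OpenAI-style response.
	var out struct {
		Choices []struct {
			Text string `json:"text"`
		} `json:"choices"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return "", err
	}
	if len(out.Choices) == 0 {
		return "", fmt.Errorf("empty completion from %s", c.BaseURL)
	}
	return out.Choices[0].Text, nil
}
```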
Modular tool adapters:
- `weather/`: OpenWeatherMap, Tomorrow.io, etc.
- `finance/`: Stock data, crypto, market sentiment
- `news/`: RSS feeds, Google News, etc.
Each tool has:
- `provider.go`: Fetches external data
- `enricher.go`: Formats data into LLM-ready prompt fragments
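A sketch of the provider/enricher split for the weather tool; the types and names here are illustrative, not the actual adapter code:

```go
package weather

import (
	"context"
	"fmt"
)

// Observation is the provider's normalized output.
type Observation struct {
	City      string
	TempC     float64
	Condition string
}

// Provider fetches external data (the provider.go role). The real
// implementation would call OpenWeatherMap or Tomorrow.io.
type Provider interface {
	Current(ctx context.Context, city string) (Observation, error)
}

// Enrich turns raw data into an LLM-ready prompt fragment
// (the enricher.go role).
func Enrich(o Observation) string {
	return fmt.Sprintf("[TOOL: Weather API]\nCurrent weather in %s: %.0f°C, %s.",
		o.City, o.TempC, o.Condition)
}
```

Keeping fetching and formatting separate means a provider can be swapped (OpenWeatherMap for Tomorrow.io) without touching the prompt fragment.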
Shared infrastructure:
- `config/`: Env loading, config structs
- `logger/`: Structured logging (zap or slog)
- `observability/`: Prometheus, tracing
- `errors/`: App-specific error types
React client:
- Live chat interface with streaming response
- Model switcher and tool visualizer
- Unit tests for context planners, prompt builders, tools
- Integration tests for full input → output validation
- Mock tools and inference clients for repeatable tests
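Building on the hypothetical `Engine` sketch above, a unit test can swap in a canned tool so the pipeline runs without any network access:

```go
package contextor

import (
	"context"
	"strings"
	"testing"
)

// mockTool is a canned-response stand-in for a real adapter.
type mockTool struct{ name, data string }

func (m mockTool) Name() string        { return m.name }
func (m mockTool) Matches(string) bool { return true }
func (m mockTool) Fetch(context.Context, string) (string, error) {
	return m.data, nil
}

func TestBuildPromptInjectsToolContext(t *testing.T) {
	e := Engine{Tools: []Tool{mockTool{"weather", "28°C, clear skies"}}}
	prompt, err := e.BuildPrompt(context.Background(), "Weather in Tokyo?")
	if err != nil {
		t.Fatal(err)
	}
	if !strings.Contains(prompt, "28°C") {
		t.Fatalf("tool context missing from prompt: %q", prompt)
	}
}
```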
- Go 1.21+ - Install Go
- vLLM or Ollama - For LLM inference
  - vLLM (default): `pip install vllm` - see the vLLM documentation
  - Ollama (alternative): `curl -fsSL https://ollama.ai/install.sh | sh`
- Optional Services:
- Redis - For session persistence
- Kafka - For tool orchestration
- PostgreSQL - For audit logging
```bash
# Clone the repository
git clone https://github.com/echenim/Nekton-Server.git
cd Nekton-Server

# Install dependencies
make deps

# Setup vLLM (default LLM provider)
pip install vllm

# Start vLLM server in another terminal
python -m vllm.entrypoints.openai.api_server --model meta-llama/Llama-3.2-3B-Instruct --port 8000

# Or setup Ollama (alternative)
# curl -fsSL https://ollama.ai/install.sh | sh
# ollama serve        # Start in another terminal
# ollama pull llama2  # Pull a model

# Copy and configure
cp config.example.yml config.yml
# Edit config.yml as needed

# Build and run
make build
./bin/nekton-server

# Check LLM connectivity
./check_llm.sh

# Test endpoints
./test_endpoints.sh

# Access Swagger UI
open http://localhost:8080/swagger/
```

See the LLM Setup Guide for detailed configuration.
```bash
# Docker
docker-compose -f deployments/docker/docker-compose.yml up --build

# Kubernetes
kubectl apply -f deployments/k8s/
```

Includes:
- MCP service
- LLM inference (vLLM)
- Redis (required for caching and session data)
- Kafka (required for real-time data pipeline)
- PostgreSQL (required for cold data storage and audit logging)
- Prometheus + Grafana (optional)
When the server is running, you can access the interactive API documentation at:
- Swagger UI: http://localhost:8080/swagger/ (or http://localhost:8080/docs/)
- Swagger JSON: http://localhost:8080/swagger/doc.json
- `GET /health` - Health check endpoint (no authentication required)
- `GET /api/health` - Alternative health check endpoint
- `GET /debug/routes` - Debug endpoint to list all registered routes
- `POST /api/v1/infer` - Generate AI response
  - Requires: JSON body with `query` field
  - Optional: `model`, `session_id`, `user_id`
  - Returns: JSON response with AI-generated text
- `POST /api/v1/infer/stream` - Generate streaming AI response (Server-Sent Events)
  - Requires: JSON body with `query` field
  - Optional: `model`, `session_id`, `user_id`
  - Returns: SSE stream with incremental AI responses
- `POST /api/v1/sessions` - Create a test session
  - Requires: JSON body with `user_id` field
  - Returns: Session details including `session_id`, `created_at`, `expires_at`
  - Purpose: For testing multi-user session functionality
- `POST /api/v1/sessions/validate` - Validate an existing session
  - Requires: JSON body with `session_id` field
  - Returns: Session details if valid, 404 if not found or expired
  - Purpose: Check if a session is still valid
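As a usage sketch against the documented request shape, here is a Go client call to `/api/v1/infer`; the response is decoded generically, since its exact schema is not specified above:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

func main() {
	// Build the request body documented for POST /api/v1/infer.
	payload, _ := json.Marshal(map[string]string{
		"query":   "What's the weather like in Tokyo right now?",
		"model":   "llama3-8b-instruct",
		"user_id": "demo-user", // optional, per the endpoint docs
	})
	resp, err := http.Post("http://localhost:8080/api/v1/infer",
		"application/json", bytes.NewReader(payload))
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	// Decode into a generic map since the response schema may evolve.
	var out map[string]any
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		log.Fatal(err)
	}
	fmt.Println(out)
}
```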
To regenerate the Swagger documentation after making API changes:

```bash
make swagger
```

To verify the Swagger documentation is up to date:

```bash
make swagger-check
```

Test scripts are provided to verify endpoints:

```bash
# Test all basic endpoints
./test_endpoints.sh

# Test session management functionality
./test_sessions.sh

# Test swagger documentation
./test_swagger.sh
```
"query": "Whatβs the weather like in Tokyo right now?",
"model": "llama3-8b-instruct"
}[TOOL: Weather API]
Current weather in Tokyo (as of 2025-07-25 12:00 JST): 28Β°C, clear skies.
Answer the following query using the context above:
Whatβs the weather like in Tokyo right now?
It's currently 28Β°C with clear skies in Tokyo. A perfect day for a walk!
To add a new tool:
1. Create a new folder under `internal/tools/your_tool/`
2. Implement:
   - `provider.go` (API fetching logic)
   - `enricher.go` (how to turn it into prompt text)
3. Register it in `planner.go` and optionally `config.yaml`
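Concretely, a new tool only has to satisfy the shared tool interface before being handed to the planner. A self-contained, illustrative sketch (the interface and registration mechanics here are assumptions, not the repository's actual API):

```go
package main

import (
	"context"
	"fmt"
)

// Tool mirrors the interface sketched earlier; a new tool just has to
// satisfy it.
type Tool interface {
	Name() string
	Matches(query string) bool
	Fetch(ctx context.Context, query string) (string, error)
}

// airQualityTool is a hypothetical new tool: provider.go logic lives in
// Fetch, enricher.go logic in the returned fragment.
type airQualityTool struct{}

func (airQualityTool) Name() string          { return "air_quality" }
func (airQualityTool) Matches(q string) bool { return true } // keyword matching in practice
func (airQualityTool) Fetch(ctx context.Context, q string) (string, error) {
	// Real code would call an external air-quality API here.
	return "AQI in Tokyo: 42 (good)", nil
}

func main() {
	// Registration: append the tool to the planner's tool set.
	var tools []Tool
	tools = append(tools, airQualityTool{})
	for _, t := range tools {
		out, _ := t.Fetch(context.Background(), "air quality in Tokyo")
		fmt.Printf("[TOOL: %s]\n%s\n", t.Name(), out)
	}
}
```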
- ✅ Tool usage tracing + debugging UI
- ✅ Function calling parser (OpenAI-compatible)
- 💡 Dynamic tool chaining (multi-hop)
- 💡 Local RAG support (knowledge base integration)
- 💡 Authenticated user sessions
- William Echenim (Architect)
- william.echenim@gmail.com
MIT License - see LICENSE file.