Quality Signals
- Quick Validate: GitHub fast checks (Python compile smoke, CI-lite dependency audit, frontend script sanity).
- GitGuardian: secret detection and leak prevention for repository history and pull requests.
- Quality Gate Status: SonarCloud quality gate for backend and frontend.
Venom is an autonomous multi-agent engineering system with planning, tool execution, and persistent memory.
- Strategic Planning - Breaks complex goals into executable steps.
- Agent Orchestration - Routes tasks to specialized agents.
- Tooling + MCP Import - Uses local tools and imports MCP tools from Git.
- Runtime LLM Selection - Switches Ollama / vLLM from the UI.
- Long-term Memory - Stores and reuses lessons and context.
- Learning by Observation - Records demonstrations and builds workflows.
- Quality Loop - User feedback, logs, and response quality metrics.
- Hidden Prompts - Approved responses stored as contextual shortcuts.
- Chat Continuity - Session history per `session_id` across restarts.
- Services Panel - `/config` shows runtime status of the local stack.
See docs/ for architecture, frontend guide, and testing policy.
# 1. Searching for current information
"What is the current Bitcoin price?"
→ System automatically searches the Internet and returns fresh data
# 2. Complex projects with planning
"Create a Snake game using PyGame"
→ System:
1. Finds PyGame documentation (ResearcherAgent)
2. Creates game structure (CoderAgent)
3. Adds snake logic (CoderAgent)
4. Implements scoring (CoderAgent)
# 3. Multi-file webpage
"Create an HTML page with a digital clock and CSS styling"
→ System creates separate files: index.html, style.css, script.js
# 4. NEW: Learning by demonstration
"Venom, watch how I send a report to Slack"
→ [User performs actions]
→ System records, analyzes, and generates a workflow
→ "Saved as skill 'send_slack_report'"
→ Later: "Venom, send report to Slack" - executes automatically!

venom_core/
├── api/routes/        # REST API endpoints (agents, tasks, memory, nodes)
├── core/flows/        # Business flows and orchestration
├── agents/            # Specialized AI agents
├── execution/         # Execution layer and model routing
├── perception/        # Perception (desktop_sensor, audio)
├── memory/            # Long-term memory (vectors, graph, workflows)
└── infrastructure/    # Infrastructure (hardware, cloud, message broker)
- ArchitectAgent - Project manager, breaks down complex tasks into steps
- ExecutionPlan - Execution plan model with defined steps and dependencies
- ResearcherAgent - Gathers and synthesizes knowledge from the Internet
- WebSearchSkill - Search (DuckDuckGo) and scraping (trafilatura)
- MemorySkill - Long-term memory (LanceDB)
- CoderAgent - Generates code using knowledge
- CriticAgent - Verifies code quality
- LibrarianAgent - Manages files and project structure
- ChatAgent - Conversation and assistant
- GhostAgent - GUI automation (RPA - Robotic Process Automation)
- ApprenticeAgent - Learning workflows through observation (NEW!)
- HybridModelRouter (`venom_core/execution/model_router.py`) - Intelligent routing between local LLM and cloud
- Operating Modes: LOCAL (local only), HYBRID (mix), CLOUD (mainly cloud); a simplified routing sketch follows after this list
- Local First: Privacy and $0 operational costs
- Providers: Ollama/vLLM (local), Google Gemini, OpenAI
- Sensitive data NEVER goes to the cloud
- Runtime as API: the model engine is treated as a replaceable HTTP server - it can be running or not without affecting the core logic, which makes it easy to swap in different model backends.
- LLM-first Direction (Ollama): in single-user mode and low query intensity, Ollama's performance is practically comparable to vLLM, and model switching is simpler. vLLM gains an advantage mainly with high parallelism and heavy load.
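The mode-based routing idea can be pictured with a small sketch. This is an illustration only: the names `AIMode` and `pick_provider` are assumptions, not the actual API of `HybridModelRouter` in `venom_core/execution/model_router.py`.

```python
# Illustrative sketch only -- names are assumptions, not the real HybridModelRouter API.
from enum import Enum


class AIMode(str, Enum):
    LOCAL = "LOCAL"    # local models only
    HYBRID = "HYBRID"  # mix of local and cloud
    CLOUD = "CLOUD"    # mainly cloud


def pick_provider(mode: AIMode, is_sensitive: bool, is_complex: bool) -> str:
    """Choose a provider label based on mode, data sensitivity, and task complexity."""
    # Sensitive data never leaves the machine, regardless of mode.
    if is_sensitive or mode is AIMode.LOCAL:
        return "local (Ollama/vLLM)"
    if mode is AIMode.CLOUD:
        return "cloud (Gemini/OpenAI)"
    # HYBRID: prefer local, escalate complex tasks to the cloud.
    return "cloud (Gemini/OpenAI)" if is_complex else "local (Ollama/vLLM)"


print(pick_provider(AIMode.HYBRID, is_sensitive=True, is_complex=True))  # local (Ollama/vLLM)
```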
- DemonstrationRecorder - Recording user actions (mouse, keyboard, screenshots)
- DemonstrationAnalyzer - Behavioral analysis and transformation from pixels → semantics
- WorkflowStore - Procedure repository with editing capability
- GhostAgent Integration - Executing generated workflows
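For illustration, a recorded demonstration could be represented roughly as below. The class and field names (`Workflow`, `WorkflowStep`, `action`, `target`) are assumptions, not the actual WorkflowStore schema.

```python
# Hypothetical shape of a recorded demonstration -- field names are illustrative.
from dataclasses import dataclass, field


@dataclass
class WorkflowStep:
    action: str        # e.g. "click", "type", "hotkey"
    target: str        # semantic target resolved by the analyzer (e.g. "Slack message box")
    payload: str = ""  # text to type, if any


@dataclass
class Workflow:
    name: str
    steps: list[WorkflowStep] = field(default_factory=list)


report_flow = Workflow(
    name="send_slack_report",
    steps=[
        WorkflowStep(action="click", target="Slack message box"),
        WorkflowStep(action="type", target="Slack message box", payload="Weekly report attached"),
    ],
)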
- Orchestrator - Main system coordinator
- IntentManager - Intent classification (5 types: CODE_GENERATION, RESEARCH, COMPLEX_PLANNING, KNOWLEDGE_SEARCH, GENERAL_CHAT)
- TaskDispatcher - Task routing to appropriate agents
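A minimal sketch of the intent-to-agent routing idea follows. The mapping mirrors the flow diagram further below; the dictionary, the function name, and the KNOWLEDGE_SEARCH/GENERAL_CHAT assignments are assumptions for illustration.

```python
# Minimal dispatch sketch -- the real IntentManager classifies intents with an LLM;
# this static mapping only illustrates how intents route to agents.
INTENT_TO_AGENT = {
    "CODE_GENERATION": "CoderAgent",
    "COMPLEX_PLANNING": "ArchitectAgent",
    "RESEARCH": "ResearcherAgent",
    "KNOWLEDGE_SEARCH": "ResearcherAgent",  # assumption: served by the research path
    "GENERAL_CHAT": "ChatAgent",            # assumption: conversational fallback
}


def dispatch(intent: str) -> str:
    """Return the agent responsible for a classified intent."""
    return INTENT_TO_AGENT.get(intent, "ChatAgent")  # fall back to conversation


print(dispatch("COMPLEX_PLANNING"))  # ArchitectAgent
```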
- Backend API (FastAPI/uvicorn) and Next.js UI - basic processes.
- LLM Servers: Ollama, vLLM - start/stop from the services panel.
- LanceDB - local vector memory (embedded); Redis - optional broker/locks (can be disabled).
- Nexus, Background Tasks - optional spots for future processes (disabled by default, no start/stop actions; can be hidden/ignored if unused).
Note about vision/image: perception currently uses local ONNX models (OCR/object recognition) and selected audio pipelines. Multimodal LLMs (Ollama/vLLM) are supported in theory, but are not wired as the vision runtime yet.
User Query
    ↓
IntentManager (intent classification)
    ↓
Orchestrator (flow decision)
    ↓
┌─────────────────────┬─────────────────────┬─────────────────────┐
│ Simple code         │ Complex project     │ Search              │
│ CODE_GENERATION     │ COMPLEX_PLANNING    │ RESEARCH            │
├─────────────────────┼─────────────────────┼─────────────────────┤
│ CoderAgent          │ ArchitectAgent      │ ResearcherAgent     │
│      ↓              │      ↓              │      ↓              │
│ CriticAgent         │ Plan creation       │ WebSearchSkill      │
│      ↓              │      ↓              │      ↓              │
│ Result              │ Plan execution      │ MemorySkill         │
│                     │ (step by step)      │      ↓              │
│                     │      ↓              │ Result              │
│                     │ Result              │                     │
└─────────────────────┴─────────────────────┴─────────────────────┘
New web-next dashboard

A detailed description of the data sources for the Brain/Strategy views and the test checklist can be found in `docs/FRONTEND_NEXT_GUIDE.md`; the document also defines entry criteria for the next stage of UI work. Chat session documentation, Direct/Normal/Complex modes, and memory behavior: `docs/CHAT_SESSION.md`. Skills standards and MCP import: `docs/DEV_GUIDE_SKILLS.md`.
The new presentation layer runs on Next.js 15 (App Router, React 19). The interface consists of two types of components:
- SCC (Server/Client Components) - by default we create server components (without the `"use client"` directive) and mark interactive fragments as client components. Thanks to this, the Brain/Strategy and Cockpit views can stream data without additional queries.
- Shared layout (`components/layout/*`) - TopBar, Sidebar, the bottom status bar, and overlays share graphic tokens and translations (`useTranslation`).
# install dependencies
npm --prefix web-next install
# development environment (http://localhost:3000)
npm --prefix web-next run dev
# production build (generates meta versions + standalone)
npm --prefix web-next run build
# short E2E tests (Playwright, prod mode)
npm --prefix web-next run test:e2e
# validate translation consistency
npm --prefix web-next run lint:locales

The predev/prebuild script runs scripts/generate-meta.mjs, which saves public/meta.json (version + commit hash). All HTTP hooks use lib/api-client.ts; in local mode you can point to the backend via environment variables:
NEXT_PUBLIC_API_BASE=http://localhost:8000
NEXT_PUBLIC_WS_BASE=ws://localhost:8000/ws/events
API_PROXY_TARGET=http://localhost:8000
Details (directory architecture, SCC guidelines, view data sources) are described in
docs/FRONTEND_NEXT_GUIDE.md.
Note: Cockpit now has two views - / (production layout with selected boxes) and /chat (reference, a full copy of the previous layout).
- Force tool: `/<tool>` (e.g. `/git`, `/web`).
- Force providers: `/gpt` (OpenAI) and `/gem` (Gemini).
- After a prefix is detected, the query content is stripped of the directive and the UI shows a "Forced" label.
- The UI language setting (PL/EN/DE) is passed as `preferred_language` in `/api/v1/tasks` (see the example request below).
- Context summary strategy (`SUMMARY_STRATEGY` in `.env`): `llm_with_fallback` (default, active model) or `heuristic_only`.
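As an example, a task can be submitted to the backend with the documented `preferred_language` field. The payload shape below (the `content` key in particular) is an assumption for illustration; check the API routes for the exact schema.

```python
# Illustrative request only -- the exact payload schema of /api/v1/tasks is an
# assumption (except preferred_language, which is documented above).
import requests

payload = {
    "content": "/web What is the current Bitcoin price?",  # forced-tool prefix from the list above
    "preferred_language": "EN",
}
resp = requests.post("http://localhost:8000/api/v1/tasks", json=payload, timeout=30)
resp.raise_for_status()
print(resp.json())
```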
# Clone repository
git clone https://github.com/mpieniak01/Venom.git
cd Venom
# Install dependencies
pip install -r requirements.txt
# Configuration (copy .env.example to .env and fill in)
cp .env.example .env

Python 3.10+ (recommended 3.11)

- semantic-kernel>=1.9.0 - Agent orchestration
- ddgs>=1.0 - Search engine (successor to duckduckgo-search)
- trafilatura - Text extraction from web pages
- beautifulsoup4 - HTML parsing
- lancedb - Vector database for memory
- fastapi - API server
- zeroconf - mDNS service discovery for the local network
- pynput - User action recording (THE_APPRENTICE)
- google-generativeai - Google Gemini (optional)
- openai / anthropic - LLM models (optional)
Full list in requirements.txt
Create .env file based on .env.example:
cp .env.example .env

The full list of steps and the deployment checklist can be found in docs/DEPLOYMENT_NEXT.md. Quick summary below:
# backend (uvicorn --reload) + web-next (next dev, turbopack off)
make start # alias make start-dev
# stop processes and clean ports 8000/3000
make stop
# PID status
make status

make start-prod  # build next + uvicorn without reload
make stop

- Backend runs on http://localhost:8000 (REST/SSE/WS).
- Next.js serves the UI on http://localhost:3000.
Venom offers flexible modes for running components separately - ideal for development environments with limited resources (PC, laptop).
| Command | Description | Resource Usage | When to Use |
|---|---|---|---|
| `make api` | Backend (production, without auto-reload) | ~50 MB RAM, ~5% CPU | Working on frontend or not editing backend code |
| `make api-dev` | Backend (development, with auto-reload) | ~110 MB RAM, ~70% CPU (spikes) | Active work on backend code |
| `make api-stop` | Stop backend only | - | Frees port 8000 and backend memory |
| `make web` | Frontend (production build + start) | ~500 MB RAM, ~3% CPU | Demo or not editing UI |
| `make web-dev` | Frontend (dev server with auto-reload) | ~1.3 GB RAM, ~7% CPU | Active UI work |
| `make web-stop` | Stop frontend only | - | Frees port 3000 and frontend memory |
| `make vllm-start` | Start vLLM (local LLM model) | ~1.4 GB RAM, 13% RAM | Only when working with local models |
| `make vllm-stop` | Stop vLLM | - | Frees ~1.4 GB RAM |
| `make ollama-start` | Start Ollama | ~400 MB RAM | Alternative to vLLM |
| `make ollama-stop` | Stop Ollama | - | Frees Ollama memory |
Scenario 1: Working only on API (Light)

make api   # Backend without auto-reload (~50 MB)
# Don't run web or LLM - save ~2.7 GB RAM

Scenario 2: Working on frontend

make api       # Backend in background (stable, no reload)
make web-dev   # Frontend with auto-reload for UI work
# Don't run LLM if not needed

Scenario 3: Full stack development

make api-dev      # Backend with auto-reload
make web-dev      # Frontend with auto-reload
make vllm-start   # LLM only if working with local models

Scenario 4: Demo / presentation

make start-prod   # Everything in production mode (lower CPU usage)

Scenario 5: API testing only

make api   # Backend without UI
curl http://localhost:8000/health
- VS Code Server: If working in the CLI, close remote VS Code:
  # From WSL/Linux
  pkill -f vscode-server
  # Or if using code tunnel
  code tunnel exit
- Autoreload: `--reload` in uvicorn spawns an additional watcher process. Use `make api` instead of `make api-dev` when not editing backend code.
- Next.js dev: `next dev` uses ~1.3 GB RAM due to auto-reload. Use `make web` (production) when only testing, not editing UI.
- LLM Environment: vLLM/Ollama use 1-2 GB RAM. Run them only when working with local models. In `AI_MODE=CLOUD` mode they are not needed.
All data and tests are treated as local experiments - Venom runs on the user's private machine and we don't encrypt artifacts. Instead, directories with results (`**/test-results/`, `perf-artifacts/`, Playwright/Locust reports) go into `.gitignore` to avoid accidentally committing sensitive data. Transparency takes priority over formal "shadow data".
AI Configuration (hybrid engine):
# AI Mode: LOCAL (local only), HYBRID (mix), CLOUD (mainly cloud)
AI_MODE=LOCAL
# Local LLM (Ollama/vLLM)
LLM_SERVICE_TYPE=local
LLM_LOCAL_ENDPOINT=http://localhost:11434/v1
LLM_MODEL_NAME=llama3
# Cloud providers (optional, required for HYBRID/CLOUD)
GOOGLE_API_KEY=your_key_here
OPENAI_API_KEY=your_key_here
# Hybrid settings
HYBRID_CLOUD_PROVIDER=google # google or openai
HYBRID_LOCAL_MODEL=llama3
HYBRID_CLOUD_MODEL=gemini-1.5-pro
SENSITIVE_DATA_LOCAL_ONLY=true  # Sensitive data ALWAYS local

Network and Discovery (local first):
# mDNS (Zeroconf) for local network - venom.local
# NOTE: Cloudflare has been removed, we use local discovery

The Hive (distributed processing):
ENABLE_HIVE=false
HIVE_URL=https://hive.example.com:8080
HIVE_REGISTRATION_TOKEN=your_token
REDIS_HOST=localhost

The Nexus (distributed mesh):
ENABLE_NEXUS=false
NEXUS_SHARED_TOKEN=your_secret_token
NEXUS_PORT=8765

External Integrations:
GITHUB_TOKEN=ghp_your_token # Personal access token
GITHUB_REPO_NAME=username/repo # Repository name
DISCORD_WEBHOOK_URL=https://... # Optional
ENABLE_ISSUE_POLLING=false   # Enable automatic issue polling

Full variable list: .env.example
External integrations documentation: docs/EXTERNAL_INTEGRATIONS.md
Hybrid AI engine documentation: docs/HYBRID_AI_ENGINE.md
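As a minimal illustration of the Discord integration, a notification can be posted to the configured webhook. This sketch assumes only the `DISCORD_WEBHOOK_URL` variable above; it is not Venom's actual PlatformSkill code.

```python
# Minimal sketch: send a message to the Discord webhook configured in .env.
import os

import requests

webhook = os.environ.get("DISCORD_WEBHOOK_URL")
if webhook:
    # Discord webhooks accept a JSON body with a "content" field.
    requests.post(webhook, json={"content": "Venom: new pull request opened"}, timeout=10)
```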
Venom 2.0 introduces a graphical configuration panel available in the web interface at http://localhost:3000/config. The panel allows:
- Status monitoring - Backend, UI, LLM (Ollama/vLLM), Hive, Nexus, background tasks
- Process control - Start/stop/restart from UI without using terminal
- Real-time metrics - PID, port, CPU%, RAM, uptime, recent logs
- Quick profiles:
  - Full Stack - All services active
  - Light - Only Backend and UI (resource saving)
  - LLM OFF - Everything except language models
The panel allows editing key runtime parameters from the UI, with automatic:
- Range validation - Ports (1-65535), confidence thresholds (0.0-1.0), boolean values
- Secret masking - API keys, tokens, passwords are hidden by default
- Configuration backup - Automatic `.env` backup to `config/env-history/` before each change
- Restart information - System indicates which services require a restart after a change
- AI Mode - AI mode, LLM endpoint, API keys, model routing
- Commands - Start/stop commands for Ollama and vLLM
- Hive - Redis configuration, queues, timeouts
- Nexus - Distributed mesh, port, tokens, heartbeat
- Tasks - Background tasks (documentation, cleanup, memory consolidation)
- Shadow - Desktop awareness, confidence thresholds, privacy filter
- Ghost - GUI automation, verification, safety delays
- Avatar - Audio interface, Whisper, TTS, VAD
- Parameter whitelist - Only defined parameters can be edited via UI
- Type and range validation - Checking value correctness before saving
- Dependency checking - System won't allow starting a service without meeting requirements (e.g. Nexus requires running backend)
- Change history - Each `.env` modification is saved with a timestamp (the last 50 backups are kept)
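The whitelist and range validation described above can be sketched as follows. The parameter names, ranges, and helper functions here are illustrative assumptions, not the panel's actual implementation.

```python
# Sketch of the whitelist + range validation idea -- parameter names and ranges
# are illustrative, not the panel's real whitelist.
EDITABLE_PARAMS = {
    "NEXUS_PORT": {"type": int, "min": 1, "max": 65535},
    "SHADOW_CONFIDENCE_THRESHOLD": {"type": float, "min": 0.0, "max": 1.0},
}

SECRET_MARKERS = ("KEY", "TOKEN", "PASSWORD")


def validate(name: str, raw_value: str):
    """Reject unknown parameters and out-of-range values before writing .env."""
    spec = EDITABLE_PARAMS.get(name)
    if spec is None:
        raise ValueError(f"{name} is not editable via the UI")
    value = spec["type"](raw_value)
    if "min" in spec and not (spec["min"] <= value <= spec["max"]):
        raise ValueError(f"{name} out of range")
    return value


def mask(name: str, value: str) -> str:
    """Hide secrets (API keys, tokens, passwords) when rendering current values."""
    return "********" if any(marker in name for marker in SECRET_MARKERS) else value


print(validate("NEXUS_PORT", "8765"))          # 8765
print(mask("GOOGLE_API_KEY", "abc123"))        # ********
```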
The panel offers a function to restore .env from earlier backups:
# Backups are located in:
config/env-history/.env-YYYYMMDD-HHMMSS

Tip: Quick profiles are ideal for switching between work modes. Use `Light` during development on a laptop, and `Full Stack` on a workstation with a GPU.
Venom offers tools for quick diagnostics of system resource usage.
# Generate diagnostic report (processes, memory, CPU, service status)
make monitor
# Manual run
bash scripts/diagnostics/system_snapshot.sh

The report will be saved to logs/diag-YYYYMMDD-HHMMSS.txt and contains:
- Uptime and load average
- Memory usage (free -h, /proc/meminfo)
- Top 15 processes (CPU and RAM)
- Venom process status (uvicorn, Next.js, vLLM, Ollama)
- PID files status and open ports (8000, 3000, 8001, 11434)
Example usage:
# Before starting work - check baseline
make monitor
# After starting services - compare usage
make api-dev
make web-dev
make monitor
# After finishing - make sure everything stopped
make stop
make monitor

If you run Venom in WSL (Windows Subsystem for Linux), you may encounter issues with vmmem - a Windows process that reserves a lot of RAM despite small Linux-side usage.
# Show detailed WSL memory statistics
bash scripts/wsl/memory_check.sh

The script will display:
- Memory summary (free -h)
- Detailed info from /proc/meminfo
- Top 10 RAM-consuming processes
- Memory usage by individual Venom components
Symptom: Task Manager in Windows shows vmmem process occupying 20-30 GB RAM, even though free -h in WSL shows only 3-4 GB.
Cause: WSL doesn't return memory to Windows immediately. Cache and buffers are kept "just in case".
Solution:
- Immediate: WSL memory reset
  # From WSL (stops all Venom processes and executes shutdown)
  bash scripts/wsl/reset_memory.sh
  # OR from Windows (PowerShell/CMD)
  wsl --shutdown
- Permanent: Limit usage via `.wslconfig`
  Create the file `%USERPROFILE%\.wslconfig` (e.g. `C:\Users\YourName\.wslconfig`):
  [wsl2]
  # Memory limit for WSL
  memory=12GB
  # Number of processors
  processors=4
  # Swap limit
  swap=8GB
An example with comments is available:
# See the full configuration with examples
cat scripts/wsl/wslconfig.example
# Copy to Windows (from WSL)
cp scripts/wsl/wslconfig.example /mnt/c/Users/YourName/.wslconfig
After saving `.wslconfig`, execute:
# From Windows (PowerShell/CMD)
wsl --shutdown
Then restart WSL terminal.
PC with 16 GB RAM (economical):
[wsl2]
memory=8GB
processors=4
swap=4GB

PC with 32 GB RAM (balanced):
[wsl2]
memory=12GB
processors=6
swap=8GB

Workstation with 64 GB RAM (performance):
[wsl2]
memory=32GB
processors=12
swap=16GB

- Open Task Manager (Ctrl+Shift+Esc)
- "Details" or "Processes" tab
- Find "vmmem" process - this is memory used by WSL
- Compare with `free -h` results in WSL
If the difference is significant (>50%), consider:
- Running `wsl --shutdown` to free the cache
- Setting limits in `.wslconfig`
- Using Light profiles (`make api` instead of `make start-dev`)
# Start server
uvicorn venom_core.main:app --reload
# Or use make
make run

- System Architecture
- Backend Architecture
- Distributed Architecture (The Hive / Nexus)
- Intent Recognition System
- Hybrid AI Engine
- System Agents Catalog (34 agents)
- Coding Agents Guidelines
- The Architect - Planning
- The Coder - Code Generation
- The Researcher - Knowledge Search
- The Chat - Conversational Assistant
- The Strategist - Complexity Analysis
- The Critic - Code Verification
- The Librarian - File Management
- The Integrator - Git & DevOps
- The Forge (Toolmaker) - Tool Creation
- Deployment (Next.js)
- External Integrations
- Guardian - Security
- QA Delivery
- Docker Minimal Packaging (sanity + publish)
- Docker Package Release Guide
- Windows WSL Install on D: (Docker Release)
Testing policy and commands are centralized in:
- docs/TESTING_POLICY.md
- docs/TESTING_CHAT_LATENCY.md (performance/latency details)
Quick local pre-PR path:
make pr-fast

Manual equivalent (if needed):
source .venv/bin/activate || true
pre-commit run --all-files
mypy venom_core
make check-new-code-coverage

If you want to run Venom from published images (without a local build), use the release compose:
git clone https://github.com/mpieniak01/Venom.git
cd Venom
# optional overrides:
# export BACKEND_IMAGE=ghcr.io/<owner>/venom-backend:v1.2.0
# export FRONTEND_IMAGE=ghcr.io/<owner>/venom-frontend:v1.2.0
# export OLLAMA_MODEL=gemma3:1b
scripts/docker/run-release.sh start

Compose profiles in this repository:
- compose/compose.release.yml - end-user profile (pulls prebuilt backend/frontend from GHCR).
- compose/compose.minimal.yml - developer profile (local build of backend/frontend).
- compose/compose.spores.yml.tmp - temporary Spore nodes draft; currently unused and intentionally not an active compose profile.
Useful commands:
scripts/docker/run-release.sh status
scripts/docker/run-release.sh restart
scripts/docker/run-release.sh stop
scripts/docker/logs.sh

Optional GPU mode:
export VENOM_ENABLE_GPU=auto # default; falls back to CPU if runtime is missing
scripts/docker/run-release.sh restart

- SonarCloud (PR gate): every pull request is analyzed for bugs, vulnerabilities, code smells, duplications, and maintainability issues.
- Snyk (periodic scan): dependency and container security scanning is executed periodically to catch newly disclosed CVEs.
- CI Lite: fast checks on every PR (lint + selected unit tests) to keep feedback loop short.
- Docker package flow: `docker-sanity` validates builds on PRs; package publishing (`docker-publish`) runs only on `v*` tags or manual trigger.
- Docker Minimal network policy: LAN testing from another machine is supported by default; run only in trusted/private networks.
What this means for contributors and agents:
- Keep functions small and readable (avoid high cognitive complexity).
- Prefer explicit typing and pass `mypy venom_core`.
- Avoid unused blocks/imports and dead code.
- Treat warnings from `ruff`, `mypy`, and Sonar as release blockers for new code.
# Installation
pip install pre-commit
pre-commit install
# Manual run
pre-commit run --all-files

cd /home/ubuntu/venom
source .venv/bin/activate || true
# Ruff (linter + formatter)
ruff check . --fix
ruff format .
# isort (import sorting)
isort .
# mypy (type checking)
mypy venom_core

Tools use the repo configuration (pyproject.toml) and skip data directories
such as models/ and models_cache/.
- Lines of code: 118,555 (non-empty lines; excluding `docs/`, `node_modules/`, `logs/`, `data/`)
- Number of agents: 33 (modules in `venom_core/agents/*`)
- Number of skills: 19 executable (`venom_core/execution/skills/*`) + 4 helper (Memory/Voice/Whisper/Core)
- Number of tests: 518 (pytest `def test_`) + 18 (Playwright `test()`)
- Test coverage: 65%
- Planning Layer (ArchitectAgent)
- Knowledge Expansion (ResearcherAgent + WebSearchSkill)
- Internet Integration
- Long-term memory
- Comprehensive tests
- NEW: External integrations (PlatformSkill)
- GitHub integration (Issues, pull requests)
- Discord/Slack notifications
- Issue → PR process
- Background polling for GitHub Issues
- Dashboard panel for external integrations
- Recursive summarization of long documents
- Search results cache
- Plan validation and optimization
- Better error recovery
- Webhook support for GitHub
- MS Teams integration
- Multi-source verification
- Google Search API integration
- Parallel plan step execution
- Plan cache for similar tasks
- GraphRAG integration
Contributions are welcome! See CONTRIBUTING.md to learn how to get started.
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit changes (`git commit -m 'feat: add new feature'`)
- Push the branch (`git push origin feature/amazing-feature`)
- Open a PR
- Code and comments: English or Polish
- Commit messages: Conventional Commits (feat, fix, docs, test, refactor)
- Style: Black + Ruff + isort (automatic via pre-commit)
- Tests: Required for new features
- Quality gates: SonarCloud must pass on PR; security baseline is continuously monitored with periodic Snyk scans
- Development Lead: mpieniak01
- Architecture: Venom Core Team
- Contributors: Contributors list
- Microsoft Semantic Kernel
- Microsoft AutoGen
- OpenAI / Anthropic / Google AI
- pytest
- Open Source Community
Venom - Autonomous AI agent system for next generation automation
If you like the project, leave a star on GitHub!
Distributed under the MIT License. See LICENSE for more information.
Copyright (c) 2025-2026 Maciej Pieniak



