diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 2d0888f..b4f6041 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -18,23 +18,26 @@ Thank you for your interest in contributing to Deadend CLI! This document provid - **Python 3.11+** required - **Docker** - Required for running the pgvector database and sandbox execution -- **uv** - Package manager for dependency management +- **uv >= 0.5.30** - Package manager for dependency management - **Playwright** - For browser automation ### Setting Up Your Development Environment 1. **Fork and clone the repository**: + ```bash git clone https://github.com/<your-username>/deadend-cli.git cd deadend-cli ``` 2. **Install dependencies**: + ```bash uv sync ``` 3. **Install Playwright browsers**: + ```bash pipx install pytest-playwright playwright install @@ -140,7 +143,6 @@ class AgentOutput(BaseModel): updated_state: dict[str, Any] | None = None ``` - ### Conventions Summary - **Confidence scores**: Always 0.0 to 1.0 (float), not percentages @@ -204,6 +206,7 @@ async def test_my_async_function(): ### Pull Request Process 1. **Create a branch**: + ```bash git checkout -b feature/your-feature-name ``` @@ -211,6 +214,7 @@ async def test_my_async_function(): 2. **Make your changes** following the code style guidelines 3. **Run tests and formatting**: + ```bash black . isort . @@ -219,12 +223,14 @@ async def test_my_async_function(): ``` 4. **Commit your changes**: + ```bash git add . git commit -m "Add: brief description of changes" ``` 5. **Push and create a PR**: + ```bash git push origin feature/your-feature-name ``` diff --git a/deadend_cli/README.md b/deadend_cli/README.md index e168ee8..96973bf 100644 --- a/deadend_cli/README.md +++ b/deadend_cli/README.md @@ -15,6 +15,7 @@ Achieves ~78% on XBOW benchmarks with fully local execution and model-agnostic a Deadend CLI is an autonomous web application penetration testing agent that uses feedback-driven iteration to adapt exploitation strategies. When standard tools fail, it generates custom Python payloads, observes responses, and iteratively refines its approach until breakthrough. 
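(Illustrative aside, not part of the patch: the feedback-driven loop described above — generate a payload, observe the response, refine from history — can be sketched in a few lines of Python. `generate_payload`, `send_probe`, and `is_breakthrough` are hypothetical stand-ins, not names from the Deadend codebase.)

```python
# Hypothetical sketch of the feedback-driven iteration loop; all names
# here are illustrative stand-ins, not Deadend CLI APIs.
from typing import Any, Callable

def refine_until_breakthrough(
    target: str,
    generate_payload: Callable[[list], Any],
    send_probe: Callable[[str, Any], Any],
    is_breakthrough: Callable[[Any], bool],
    max_iterations: int = 10,
) -> tuple[Any, Any] | None:
    """Generate, observe, and refine until an attempt succeeds or budget runs out."""
    history: list[tuple[Any, Any]] = []
    for _ in range(max_iterations):
        payload = generate_payload(history)      # adapt strategy from past attempts
        response = send_probe(target, payload)   # observe how the target reacts
        history.append((payload, response))
        if is_breakthrough(response):            # stop once an exploit lands
            return payload, response
    return None                                  # iteration budget exhausted
```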
**Key features:** + - Fully local execution (no cloud dependencies, zero data exfiltration) - Model-agnostic design (works with any deployable LLM) - Custom sandboxed tools (Playwright, Docker, WebAssembly) @@ -51,11 +52,14 @@ The framework focuses on **intelligent security analysis** through: ## Quick Start ### Prerequisites + - Docker (required) - Python 3.11+ +- uv >= 0.5.30 - Playwright: `playwright install` ### Installation + ```bash # Install via pipx (recommended) pipx install deadend_cli @@ -67,6 +71,7 @@ uv sync && uv build ``` ### First Run + ```bash # Initialize configuration deadend-cli init @@ -82,6 +87,7 @@ deadend-cli chat \ ## Usage Examples ### Basic Vulnerability Testing + ```bash # Test OWASP Juice Shop docker run -p 3000:3000 bkimminich/juice-shop @@ -92,6 +98,7 @@ deadend-cli chat \ ``` ### API Security Testing + ```bash deadend-cli chat \ --target "https://api.example.com" \ @@ -99,6 +106,7 @@ deadend-cli chat \ ``` ### Autonomous Mode + ```bash # Run without approval prompts (CTFs/labs only) deadend-cli chat \ @@ -112,21 +120,27 @@ deadend-cli chat \ ## Commands ### `deadend-cli init` + Initialize configuration and set up pgvector database ### `deadend-cli chat` + Start interactive security testing session + - `--target`: Target URL - `--prompt`: Initial testing prompt - `--mode`: `hacker` (approval required) or `yolo` (autonomous) ### `deadend-cli eval-agent` + Run evaluation against challenge datasets + - `--eval-metadata-file`: Challenge dataset file - `--llm-providers`: AI model providers to test - `--guided`: Run with subtask decomposition ### `deadend-cli version` + Display current version --- @@ -149,12 +163,12 @@ The agent uses a two-phase approach (reconnaissance → exploitation) with a sup Evaluated on XBOW's 104-challenge validation suite (black-box mode, January 2026): -| Agent | Success Rate | Infrastructure | Blind SQLi | -|-------|-------------|----------------|------------| -| XBOW (proprietary) | 85% | Proprietary | ? | -| Cyber-AutoAgent | 81% | AWS Bedrock | 0% | -| **Deadend CLI** | **78%** | **Fully local** | **33%** | -| MAPTA | 76.9% | External APIs | 0% | +| Agent | Success Rate | Infrastructure | Blind SQLi | +| ------------------ | ------------ | --------------- | ---------- | +| XBOW (proprietary) | 85% | Proprietary | ? | +| Cyber-AutoAgent | 81% | AWS Bedrock | 0% | +| **Deadend CLI** | **78%** | **Fully local** | **33%** | +| MAPTA | 76.9% | External APIs | 0% | **Models tested:** Claude Sonnet 4.5 (~78%), Kimi K2 Thinking (~69%) @@ -166,11 +180,13 @@ Perfect scores: GraphQL, SSRF, NoSQL injection, HTTP method tampering (100%) ## Operating Modes **Hacker Mode (default):** Requires approval for dangerous operations + ```bash deadend-cli chat --target URL --mode hacker ``` **YOLO Mode:** Autonomous execution (CTFs/labs only) + ```bash deadend-cli chat --target URL --mode yolo ``` @@ -197,6 +213,7 @@ Configuration is managed via `~/.cache/deadend/config.toml`. Run `deadend-cli in ## Current Status & Roadmap ### Stable (v0.0.15) + ✅ New architecture ✅ XBOW benchmark evaluation (78%) ✅ Custom sandboxed tools @@ -204,7 +221,9 @@ Configuration is managed via `~/.cache/deadend/config.toml`. 
Run `deadend-cli in ✅ Two-phase execution (recon + exploitation) ### In Progress (v0.1.0) + 🚧 **CLI Redesign** with enhanced workflows: + - Plan mode (review strategies before execution) - Preset configuration workflows (API testing, web apps, auth bypass) - Workflow automation (save/replay attack chains) @@ -212,8 +231,8 @@ Configuration is managed via `~/.cache/deadend/config.toml`. Run `deadend-cli in 🚧 Context optimization (reduce redundant tool calls) 🚧 Secrets management improvements - ### Future roadmap + The current architecture proves competitive autonomous pentesting (78%) is achievable without cloud dependencies. Next challenges: - **Open-Source Models**: Achieve 75%+ with Llama/Qwen (eliminate proprietary dependencies) @@ -229,6 +248,7 @@ Goal: Make autonomous pentesting accessible (open models), comprehensive (hybrid ## Contributing Contributions welcome in: + - Context optimization algorithms - Vulnerability test cases - Open-weight model fine-tuning @@ -239,6 +259,7 @@ See [CONTRIBUTING.md](../CONTRIBUTING.md) for guidelines on how to contribute. --- ## Citation + ```bibtex @software{deadend_cli_2026, author = {Yassine Bargach}, diff --git a/deadend_cli/deadend_agent/src/deadend_agent/core.py b/deadend_cli/deadend_agent/src/deadend_agent/core.py index 9fd45de..55356e3 100644 --- a/deadend_cli/deadend_agent/src/deadend_agent/core.py +++ b/deadend_cli/deadend_agent/src/deadend_agent/core.py @@ -11,6 +11,7 @@ from pathlib import Path import hashlib +import platform import subprocess import requests from deadend_agent.config.settings import Config @@ -19,12 +20,29 @@ from deadend_agent.rag.db_cruds import RetrievalDatabaseConnector -PYTHON_SANDBOX_NAME = "python-sandbox-tool-linux" -SIMPLE_PYTHON_SANDBOX_URL = ( - "https://github.com/xoxruns/simple-python-interpreter-sandbox/" - "releases/download/v0.0.3/python-sandbox-tool-linux" -) -PYTHON_SANDBOX_SHA256 = "74b8a80709a912028600f39b9953889c011278a80acf066af5bd6979366455f4" +# Platform-specific sandbox binary configurations +SANDBOX_CONFIGS = { + "Linux": { + "name": "python-sandbox-tool-linux", + "url": "https://github.com/xoxruns/simple-python-interpreter-sandbox/releases/download/v0.0.3/python-sandbox-tool-linux", + "sha256": "74b8a80709a912028600f39b9953889c011278a80acf066af5bd6979366455f4", + }, + "Darwin": { + "name": "python-sandbox-tool-macos", + "url": "https://github.com/xoxruns/simple-python-interpreter-sandbox/releases/download/v0.0.3/python-sandbox-tool-macos", + "sha256": "9dc49652b1314978544e3e56eef67610d10a2fbb51ecaf06bc10f9c27ad75d7c", + }, +} + + +def get_sandbox_config(): + """Get the sandbox configuration for the current platform.""" + system = platform.system() + if system not in SANDBOX_CONFIGS: + raise RuntimeError( + f"Unsupported platform: {system}. 
Supported platforms: {', '.join(SANDBOX_CONFIGS.keys())}" + ) + return SANDBOX_CONFIGS[system] def config_setup() -> Config: """Setup config""" @@ -46,9 +64,10 @@ def sandbox_setup() -> SandboxManager: sandbox_manager = SandboxManager() return sandbox_manager -def setup_model_registry(config: Config) -> ModelRegistry: +async def setup_model_registry(config: Config) -> ModelRegistry: """Setup Model registry""" model_registry = ModelRegistry(config=config) + await model_registry.initialize() return model_registry def _file_matches_sha256(path: Path, expected_hash: str) -> bool: @@ -65,36 +84,35 @@ def _file_matches_sha256(path: Path, expected_hash: str) -> bool: def download_python_sandbox( destination_dir: Path | None = None, - expected_sha256: str = PYTHON_SANDBOX_SHA256, ) -> Path: """Download the Python sandbox binary to the local cache if missing or outdated. Args: destination_dir: Optional directory to store the sandbox binary. Defaults to ~/.cache/deadend/python/. - expected_sha256: Expected SHA-256 checksum of the binary. Returns: Path to the downloaded (or existing) sandbox binary. """ + config = get_sandbox_config() cache_dir = destination_dir or Path.home() / ".cache" / "deadend" / "python" cache_dir.mkdir(parents=True, exist_ok=True) - sandbox_path = cache_dir / PYTHON_SANDBOX_NAME + sandbox_path = cache_dir / config["name"] - if _file_matches_sha256(sandbox_path, expected_sha256): + if _file_matches_sha256(sandbox_path, config["sha256"]): return sandbox_path if sandbox_path.exists(): sandbox_path.unlink() - response = requests.get(SIMPLE_PYTHON_SANDBOX_URL, stream=True, timeout=120) + response = requests.get(config["url"], stream=True, timeout=120) response.raise_for_status() with open(sandbox_path, "wb") as fd: for chunk in response.iter_content(chunk_size=8192): if chunk: fd.write(chunk) - if not _file_matches_sha256(sandbox_path, expected_sha256): + if not _file_matches_sha256(sandbox_path, config["sha256"]): sandbox_path.unlink(missing_ok=True) raise RuntimeError( "Downloaded Python sandbox binary failed checksum verification." diff --git a/deadend_cli/deadend_agent/src/deadend_agent/models/registry.py b/deadend_cli/deadend_agent/src/deadend_agent/models/registry.py index 8169355..89f38c4 100644 --- a/deadend_cli/deadend_agent/src/deadend_agent/models/registry.py +++ b/deadend_cli/deadend_agent/src/deadend_agent/models/registry.py @@ -9,7 +9,7 @@ initialization, and provider-specific model abstractions. """ -from typing import Dict +from typing import Dict, Optional import aiohttp from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.models.anthropic import AnthropicModel @@ -33,23 +33,25 @@ class EmbedderClient: """Client for generating embeddings using various embedding API providers. - + This class provides a unified interface for embedding generation across different providers (OpenAI, OpenRouter, etc.) by abstracting the API communication and response parsing. - + Attributes: model: Name of the embedding model to use. api_key: API key for authenticating with the embedding service. base_url: Base URL for the embedding API endpoint. + _session: Shared aiohttp ClientSession for connection reuse. """ model: str api_key: str base_url: str + _session: Optional[aiohttp.ClientSession] def __init__(self, model_name: str, api_key: str, base_url: str) -> None: """Initialize the EmbedderClient with provider configuration. - + Args: model_name: Name of the embedding model to use (e.g., "text-embedding-3-small"). 
api_key: API key for authenticating with the embedding service. @@ -58,94 +60,134 @@ def __init__(self, model_name: str, api_key: str, base_url: str) -> None: self.model = model_name self.api_key = api_key self.base_url = base_url + self._session = None + + async def initialize(self) -> None: + """Initialize the shared ClientSession for HTTP requests. + + Creates a persistent aiohttp ClientSession that will be reused + across all embedding requests to avoid resource exhaustion from + creating too many concurrent connections. + """ + if self._session is None: + self._session = aiohttp.ClientSession() + + async def close(self) -> None: + """Close the shared ClientSession and cleanup resources. + + Should be called when the EmbedderClient is no longer needed + to properly release HTTP connection resources. + """ + if self._session is not None: + await self._session.close() + self._session = None async def batch_embed(self, input: list) -> list: """Generate embeddings for a batch of input texts. - + Sends a batch embedding request to the configured API endpoint and handles various response formats. Supports OpenAI-compatible APIs and other providers with different response structures. - + Args: input: List of text strings to embed. Each string will be embedded into a vector representation. - + Returns: List of embedding dictionaries. Each dictionary contains an 'embedding' key with the vector representation. Returns empty list if no embeddings were generated. - + Raises: ValueError: If the API returns a non-200 status code, an error response, or an unexpected response structure. + RuntimeError: If the session has not been initialized. """ - async with aiohttp.ClientSession() as session: - response = await session.post( - url=self.base_url, - headers={ - "Authorization": f"Bearer {self.api_key}", - "Content-Type": "application/json", - }, - json={ - "model": self.model, - "input": input - } - ) - - # Check HTTP status code - if response.status != 200: - error_text = await response.text() - raise ValueError(f"Embedding API returned status {response.status}: {error_text}") - - data = await response.json() - - # Handle different response structures - # OpenAI format: {"data": [{"embedding": [...]}, ...]} - # Some APIs might return the data directly or in a different structure - if isinstance(data, dict) and 'data' in data: - embeddings = data['data'] - elif isinstance(data, list): - # Response is already a list of embeddings - embeddings = data - elif isinstance(data, dict) and 'error' in data: - # API returned an error - error_info = data.get('error', {}) - error_msg = error_info.get('message', str(error_info)) if isinstance(error_info, dict) else str(error_info) - raise ValueError(f"Embedding API error: {error_msg}") - else: - # Try to find embeddings in the response - error_msg = f"Unexpected response structure: {list(data.keys()) if isinstance(data, dict) else type(data)}" - raise ValueError(error_msg) + if self._session is None: + raise RuntimeError("EmbedderClient session not initialized. 
Call initialize() first.") + + response = await self._session.post( + url=self.base_url, + headers={ + "Authorization": f"Bearer {self.api_key}", + "Content-Type": "application/json", + }, + json={ + "model": self.model, + "input": input + } + ) + + # Check HTTP status code + if response.status != 200: + error_text = await response.text() + raise ValueError(f"Embedding API returned status {response.status}: {error_text}") + + data = await response.json() + + # Handle different response structures + # OpenAI format: {"data": [{"embedding": [...]}, ...]} + # Some APIs might return the data directly or in a different structure + if isinstance(data, dict) and 'data' in data: + embeddings = data['data'] + elif isinstance(data, list): + # Response is already a list of embeddings + embeddings = data + elif isinstance(data, dict) and 'error' in data: + # API returned an error + error_info = data.get('error', {}) + error_msg = error_info.get('message', str(error_info)) if isinstance(error_info, dict) else str(error_info) + raise ValueError(f"Embedding API error: {error_msg}") + else: + # Try to find embeddings in the response + error_msg = f"Unexpected response structure: {list(data.keys()) if isinstance(data, dict) else type(data)}" + raise ValueError(error_msg) return embeddings if embeddings else [] class ModelRegistry: """Registry for managing AI model instances from multiple providers. - + This class initializes and manages access to language models from various providers (OpenAI, Anthropic, Google/Gemini, OpenRouter) based on configuration settings. It also manages the embedding client for generating vector embeddings. - + Attributes: embedder_model: Embedding client instance, or None if not initialized. + _initialized: Flag indicating whether async initialization is complete. """ embedder_model: EmbedderClient | None + _initialized: bool def __init__(self, config: Config): """Initialize the ModelRegistry with configuration. - + Reads model settings from the provided configuration and initializes model instances for all configured providers. Also sets up the embedding client based on the first available provider configuration. - + + Note: After creating ModelRegistry, you must call initialize() before + using the embedder client. + Args: config: Configuration object containing API keys and model settings for various providers. """ self._models: Dict[str, AIModel] = {} + self._initialized = False self._initialize_models(config=config) + async def initialize(self) -> None: + """Initialize async resources like the embedder ClientSession. + + Must be called after __init__ and before using the embedder client. + This is a separate method because __init__ cannot be async. + """ + if not self._initialized and self.embedder_model is not None: + await self.embedder_model.initialize() + self._initialized = True + def _initialize_models(self, config: Config): """Initialize model instances for all configured providers. diff --git a/deadend_cli/deadend_agent/src/deadend_agent/rag/database.py b/deadend_cli/deadend_agent/src/deadend_agent/rag/database.py index 285faa7..561cfda 100644 --- a/deadend_cli/deadend_agent/src/deadend_agent/rag/database.py +++ b/deadend_cli/deadend_agent/src/deadend_agent/rag/database.py @@ -74,12 +74,12 @@ class RagDeps: @dataclass class CodeSection: """Represents a code section with metadata and embeddings. - + Attributes: url_path: URL or file path where the code section is located. title: Descriptive title for the code section. 
content: Dictionary containing the actual code content. - embeddings: Vector embeddings for semantic search (4096 dimensions). + embeddings: Vector embeddings for semantic search (1536 dimensions for OpenAI models). """ url_path: str title: str @@ -137,8 +137,8 @@ async def embed_content( url text NOT NULL, title text NOT NULL, content text NOT NULL, - -- returns a vector of 4096 floats - embedding vector(4096) NOT NULL + -- 1536 dimensions matches OpenAI text-embedding-3-small and text-embedding-ada-002 + embedding vector(1536) NOT NULL ); CREATE INDEX IF NOT EXISTS idx_code_sections_embedding ON code_sections USING hnsw (embedding vector_l2_ops); """ diff --git a/deadend_cli/deadend_agent/src/deadend_agent/rag/db_cruds.py b/deadend_cli/deadend_agent/src/deadend_agent/rag/db_cruds.py index ff13fa0..370e429 100644 --- a/deadend_cli/deadend_agent/src/deadend_agent/rag/db_cruds.py +++ b/deadend_cli/deadend_agent/src/deadend_agent/rag/db_cruds.py @@ -14,6 +14,7 @@ from datetime import datetime from typing import List, Optional, Dict, Any, AsyncGenerator from contextlib import asynccontextmanager +from urllib.parse import urlparse # import numpy as np from sqlalchemy.ext.asyncio import create_async_engine, AsyncSession, async_sessionmaker from sqlalchemy import text, select @@ -28,12 +29,20 @@ def __init__(self, database_url: str, pool_size: int = 20, max_overflow: int = 3 if database_url.startswith("postgresql://"): database_url = database_url.replace("postgresql://", "postgresql+asyncpg://", 1) + # Disable SSL for localhost connections to fix macOS asyncpg issues + # asyncpg requires ssl=False instead of sslmode URL parameter + parsed = urlparse(database_url) + connect_args = {} + if parsed.hostname in ('localhost', '127.0.0.1', '::1'): + connect_args['ssl'] = False + self.engine = create_async_engine( database_url, pool_size=pool_size, max_overflow=max_overflow, pool_pre_ping=True, - echo=False # Set to True for SQL debugging + echo=False, # Set to True for SQL debugging + connect_args=connect_args ) self.async_session = async_sessionmaker( diff --git a/deadend_cli/deadend_agent/src/deadend_agent/rag/models.py b/deadend_cli/deadend_agent/src/deadend_agent/rag/models.py index 08054aa..330f7b9 100644 --- a/deadend_cli/deadend_agent/src/deadend_agent/rag/models.py +++ b/deadend_cli/deadend_agent/src/deadend_agent/rag/models.py @@ -34,7 +34,8 @@ class CodeChunk(Base): language = Column(String(50), nullable=False) # start_line = Column(Integer, nullable=True) # end_line = Column(Integer, nullable=True) - embedding = Column(Vector(4096), nullable=False) + # 1536 dimensions matches OpenAI text-embedding-3-small and text-embedding-ada-002 + embedding = Column(Vector(1536), nullable=False) # Metadata created_at = Column(DateTime, default=datetime.now()) updated_at = Column(DateTime, default=datetime.now(), onupdate=datetime.now()) @@ -57,7 +58,8 @@ class CodebaseChunk(Base): struct_name = Column(String(200), nullable=True) language = Column(String(50), nullable=False) code_content = Column(Text, nullable=False) - embedding = Column(Vector(4096), nullable=False) + # 1536 dimensions matches OpenAI text-embedding-3-small and text-embedding-ada-002 + embedding = Column(Vector(1536), nullable=False) # metadata created_at = Column(DateTime, default=datetime.now()) updated_at = Column(DateTime, default=datetime.now(), onupdate=datetime.now()) @@ -78,7 +80,8 @@ class KnowledgeBase(Base): file_path = Column(String(500), nullable=False) content_metadata = Column(Text, nullable=False) content = Column(Text, 
nullable=False) - embedding = Column(Vector(4096), nullable=False) + # 1536 dimensions matches OpenAI text-embedding-3-small and text-embedding-ada-002 + embedding = Column(Vector(1536), nullable=False) # metadata created_at = Column(DateTime, default=datetime.now()) updated_at = Column(DateTime, default=datetime.now(), onupdate=datetime.now()) diff --git a/deadend_cli/deadend_agent/src/deadend_agent/tools/browser_automation/http_parser.py b/deadend_cli/deadend_agent/src/deadend_agent/tools/browser_automation/http_parser.py index 8f7dc16..a885bc3 100644 --- a/deadend_cli/deadend_agent/src/deadend_agent/tools/browser_automation/http_parser.py +++ b/deadend_cli/deadend_agent/src/deadend_agent/tools/browser_automation/http_parser.py @@ -215,25 +215,20 @@ def analyze_http_request_text(raw_request_text: str) -> tuple[bool, dict]: def extract_host_port(target_host: str) -> Tuple[str, int]: """Extract host and port from a URL string using urllib.parse.urlparse""" - if target_host.startswith("http://"): - default_port = 80 - elif target_host.startswith("https://"): - default_port = 443 - else: - default_port = 80 - - parts = target_host.split(":") - if len(parts) >= 2: - try: - port_int = int(parts[-1]) - host = ":".join(parts[:-1]) - return host, port_int - except ValueError: - host = target_host - return host, default_port - else: - host = target_host - return host, default_port + # If no scheme, add one for parsing + if not target_host.startswith(('http://', 'https://')): + target_host = f"http://{target_host}" + + # Parse the URL properly + parsed = urlparse(target_host) + host = parsed.hostname or 'localhost' + port = parsed.port + + # If no port specified, use default based on scheme + if port is None: + port = 443 if parsed.scheme == 'https' else 80 + + return host, port import re diff --git a/deadend_cli/deadend_agent/tests/deadend_sdk/tools/browser_automation/test_extract_host_port.py b/deadend_cli/deadend_agent/tests/deadend_sdk/tools/browser_automation/test_extract_host_port.py new file mode 100644 index 0000000..876ddca --- /dev/null +++ b/deadend_cli/deadend_agent/tests/deadend_sdk/tools/browser_automation/test_extract_host_port.py @@ -0,0 +1,109 @@ +""" +Unit tests for extract_host_port functionality. 
+""" +import pytest +from deadend_agent.tools.browser_automation.http_parser import extract_host_port + + +class TestExtractHostPort: + """Tests for extract_host_port function.""" + + def test_http_url_with_port(self): + """HTTP URL with port should extract host and port correctly.""" + host, port = extract_host_port("http://localhost:3000") + assert host == "localhost" + assert port == 3000 + + def test_https_url_with_port(self): + """HTTPS URL with port should extract host and port correctly.""" + host, port = extract_host_port("https://localhost:3000") + assert host == "localhost" + assert port == 3000 + + def test_host_with_port_no_protocol(self): + """Host:port without protocol should extract correctly.""" + host, port = extract_host_port("localhost:3000") + assert host == "localhost" + assert port == 3000 + + def test_http_url_with_custom_port(self): + """HTTP URL with custom port should extract correctly.""" + host, port = extract_host_port("http://example.com:8080") + assert host == "example.com" + assert port == 8080 + + def test_https_url_with_standard_port(self): + """HTTPS URL with standard port 443 should extract correctly.""" + host, port = extract_host_port("https://example.com:443") + assert host == "example.com" + assert port == 443 + + def test_http_url_no_port(self): + """HTTP URL without port should default to 80.""" + host, port = extract_host_port("http://example.com") + assert host == "example.com" + assert port == 80 + + def test_https_url_no_port(self): + """HTTPS URL without port should default to 443.""" + host, port = extract_host_port("https://example.com") + assert host == "example.com" + assert port == 443 + + def test_bare_hostname(self): + """Bare hostname without protocol should default to port 80.""" + host, port = extract_host_port("example.com") + assert host == "example.com" + assert port == 80 + + def test_bare_localhost(self): + """Bare localhost without port should default to port 80.""" + host, port = extract_host_port("localhost") + assert host == "localhost" + assert port == 80 + + def test_ip_address_with_port(self): + """IP address with port should extract correctly.""" + host, port = extract_host_port("127.0.0.1:8000") + assert host == "127.0.0.1" + assert port == 8000 + + def test_http_ip_address_with_port(self): + """HTTP URL with IP address and port should extract correctly.""" + host, port = extract_host_port("http://127.0.0.1:8000") + assert host == "127.0.0.1" + assert port == 8000 + + def test_url_with_path_ignored(self): + """URL with path should ignore the path and extract host:port.""" + host, port = extract_host_port("http://example.com:8080/api/v1") + assert host == "example.com" + assert port == 8080 + + def test_url_with_query_ignored(self): + """URL with query params should ignore them and extract host:port.""" + host, port = extract_host_port("http://example.com:8080?param=value") + assert host == "example.com" + assert port == 8080 + + def test_url_reconstruction_no_duplicate_protocol(self): + """ + Test that extract_host_port prevents protocol duplication in URL construction. + This is the bug we're fixing: http://http://localhost:3000 should not happen. 
+ """ + # Input with protocol + host, port = extract_host_port("http://localhost:3000") + # Reconstruct URL (simulating pw_requester.py behavior) + reconstructed_url = f"http://{host}:{port}/path" + + # Should NOT have duplicate protocol + assert reconstructed_url == "http://localhost:3000/path" + assert "http://http://" not in reconstructed_url + + def test_https_url_reconstruction_no_duplicate_protocol(self): + """Test HTTPS URL reconstruction doesn't duplicate protocol.""" + host, port = extract_host_port("https://example.com:443") + reconstructed_url = f"https://{host}:{port}/api" + + assert reconstructed_url == "https://example.com:443/api" + assert "https://https://" not in reconstructed_url diff --git a/deadend_cli/src/deadend_cli/chat.py b/deadend_cli/src/deadend_cli/chat.py index 17c6f77..b317662 100644 --- a/deadend_cli/src/deadend_cli/chat.py +++ b/deadend_cli/src/deadend_cli/chat.py @@ -361,6 +361,7 @@ async def chat_interface( ): """Chat Interface for the CLI""" model_registry = ModelRegistry(config=config) + await model_registry.initialize() if not model_registry.has_any_model(): raise RuntimeError(f"No LM model configured. You can run `deadend init` to \ initialize the required Model configuration for {llm_provider}") diff --git a/deadend_cli/src/deadend_cli/cli.py b/deadend_cli/src/deadend_cli/cli.py index e2ac786..c07ac76 100644 --- a/deadend_cli/src/deadend_cli/cli.py +++ b/deadend_cli/src/deadend_cli/cli.py @@ -6,27 +6,37 @@ Defines commands to run interactive chat and evaluation agents. """ -import importlib.metadata import asyncio +import importlib.metadata +import os from typing import List -import typer + import docker +import logfire +import typer from docker.errors import DockerException from rich.console import Console -import logfire + +# Fix Docker socket path if default doesn't exist +if not os.path.exists("/var/run/docker.sock"): + docker_socket = os.path.expanduser("~/.docker/run/docker.sock") + if os.path.exists(docker_socket): + os.environ["DOCKER_HOST"] = f"unix://{docker_socket}" from deadend_agent import config_setup from deadend_agent.core import start_python_sandbox -from .chat import chat_interface, Modes -from .eval import eval_interface + from .banner import print_banner -from .init import init_cli_config, check_docker, \ - check_pgvector_container, stop_pgvector_container, setup_pgvector_database +from .chat import Modes, chat_interface +from .eval import eval_interface +from .init import (check_docker, check_pgvector_container, init_cli_config, + setup_pgvector_database, stop_pgvector_container) console = Console() app = typer.Typer(help="Deadend CLI - interact with the Deadend framework.") + @app.command() def version(): """Show the version of the Deadend framework.""" @@ -34,17 +44,23 @@ def version(): package_version = importlib.metadata.version("deadend_cli") console.print(f"[bold green]Deadend CLI v{package_version}[/bold green]") except importlib.metadata.PackageNotFoundError: - console.print("[bold red]Deadend CLI[/bold red] - [yellow]Version not available[/yellow]") + console.print( + "[bold red]Deadend CLI[/bold red] - [yellow]Version not available[/yellow]" + ) @app.command() def chat( prompt: str = typer.Option(None, help="Send a prompt directly to chat mode."), target: str = typer.Option(None, help="Target URL or identifier for chat."), - mode: Modes = typer.Option(Modes.hacker, help="Two modes available, yolo and hacker."), - openapi_spec: str = typer.Option(None, help="Path to the OpenAPI specification file."), - knowledge_base: str 
= typer.Option(None, help="Folder path to the knowledge base.") - ): + mode: Modes = typer.Option( + Modes.hacker, help="Two modes available, yolo and hacker." + ), + openapi_spec: str = typer.Option( + None, help="Path to the OpenAPI specification file." + ), + knowledge_base: str = typer.Option(None, help="Folder path to the knowledge base."), +): """Run the interactive chat agent. Args: @@ -55,9 +71,13 @@ def chat( # Check Docker availability first docker_client = docker.from_env() if not check_docker(docker_client): - console.print("\n[red]Docker is required for this application to function properly.[/red]") + console.print( + "\n[red]Docker is required for this application to function properly.[/red]" + ) console.print("Please install Docker from: https://docs.docker.com/get-docker/") - console.print("Make sure Docker daemon is running, then run this command again.") + console.print( + "Make sure Docker daemon is running, then run this command again." + ) raise typer.Exit(1) # Check pgvector database and setup if not running @@ -84,7 +104,7 @@ def chat( mode=mode, target=target, openapi_spec=openapi_spec, - knowledge_base=knowledge_base + knowledge_base=knowledge_base, ) ) finally: @@ -94,18 +114,24 @@ def chat( try: stop_pgvector_container(docker_client) except (DockerException, OSError, ConnectionError) as e: - console.print(f"[yellow]Warning: Could not stop pgvector container: {e}[/yellow]") + console.print( + f"[yellow]Warning: Could not stop pgvector container: {e}[/yellow]" + ) @app.command() def eval_agent( eval_metadata_file: str = typer.Option( None, - help="Dataset file containing all the information about the challenges to run" + help="Dataset file containing all the information about the challenges to run", ), - llm_providers: List[str] = typer.Option(['openai'], help="Specify the eval providers"), - guided: bool = typer.Option(False, help="Run subtasks instead of one general task.") - ): + llm_providers: List[str] = typer.Option( + ["openai"], help="Specify the eval providers" + ), + guided: bool = typer.Option( + False, help="Run subtasks instead of one general task." + ), +): """Run the evaluation agent on a dataset of challenges. Args: @@ -117,9 +143,13 @@ def eval_agent( # Check Docker availability first docker_client = docker.from_env() if not check_docker(docker_client): - console.print("\n[red]Docker is required for this application to function properly.[/red]") + console.print( + "\n[red]Docker is required for this application to function properly.[/red]" + ) console.print("Please install Docker from: https://docs.docker.com/get-docker/") - console.print("Make sure Docker daemon is running, then run this command again.") + console.print( + "Make sure Docker daemon is running, then run this command again." 
+ ) raise typer.Exit(1) # Check pgvector database and setup if not running @@ -140,7 +170,7 @@ def eval_agent( config=config, eval_metadata_file=eval_metadata_file, providers=llm_providers, - guided=guided + guided=guided, ) ) finally: @@ -150,7 +180,10 @@ def eval_agent( try: stop_pgvector_container(docker_client) except (DockerException, OSError, ConnectionError) as e: - console.print(f"[yellow]Warning: Could not stop pgvector container: {e}[/yellow]") + console.print( + f"[yellow]Warning: Could not stop pgvector container: {e}[/yellow]" + ) + @app.command() def init(): diff --git a/deadend_cli/src/deadend_cli/eval.py b/deadend_cli/src/deadend_cli/eval.py index ff321a1..61a4a83 100644 --- a/deadend_cli/src/deadend_cli/eval.py +++ b/deadend_cli/src/deadend_cli/eval.py @@ -67,6 +67,7 @@ async def eval_interface( eval_metadata = EvalMetadata(**data) model_registry = ModelRegistry(config=config) + await model_registry.initialize() if not model_registry.has_any_model(): raise RuntimeError(f"No LM model configured. You can run `deadend init` to \ initialize the required Model configuration for {providers[0]}") diff --git a/deadend_cli/src/deadend_cli/init.py b/deadend_cli/src/deadend_cli/init.py index 9b96597..4f94402 100644 --- a/deadend_cli/src/deadend_cli/init.py +++ b/deadend_cli/src/deadend_cli/init.py @@ -11,9 +11,10 @@ import os import time from pathlib import Path + +import docker import toml import typer -import docker from docker.errors import DockerException, NotFound from rich.console import Console @@ -22,10 +23,10 @@ def check_docker(client: docker.DockerClient) -> bool: """Check if Docker daemon is running using the Docker Python API. - + Args: client: Docker client instance - + Returns: bool: True if Docker daemon is available and running, False otherwise """ @@ -45,10 +46,10 @@ def check_docker(client: docker.DockerClient) -> bool: def check_pgvector_container(client: docker.DockerClient) -> bool: """Check if pgvector container is running. - + Args: client: Docker client instance - + Returns: bool: True if pgvector container is running, False otherwise """ @@ -58,16 +59,18 @@ def check_pgvector_container(client: docker.DockerClient) -> bool: except NotFound: return False except DockerException as e: - console.print(f"[yellow]Warning: Could not check pgvector container status: {e}[/yellow]") + console.print( + f"[yellow]Warning: Could not check pgvector container status: {e}[/yellow]" + ) return False def setup_pgvector_database(client: docker.DockerClient) -> bool: """Setup pgvector database using Docker API. 
- + Args: client: Docker client instance - + Returns: bool: True if setup successful, False otherwise """ @@ -79,7 +82,9 @@ def setup_pgvector_database(client: docker.DockerClient) -> bool: console.print("[green]pgvector database is already running.[/green]") return True else: - console.print("[yellow]Found existing pgvector container, starting it...[/yellow]") + console.print( + "[yellow]Found existing pgvector container, starting it...[/yellow]" + ) existing_container.start() # Wait for container to be ready time.sleep(5) @@ -105,13 +110,18 @@ def setup_pgvector_database(client: docker.DockerClient) -> bool: name="deadend_pg", environment={ "POSTGRES_DB": "codeindexerdb", - "POSTGRES_USER": "postgres", - "POSTGRES_PASSWORD": "postgres" + "POSTGRES_USER": "postgres", + "POSTGRES_PASSWORD": "postgres", }, ports={"5432/tcp": 54320}, - volumes={str(postgres_data_dir): {"bind": "/var/lib/postgresql/data", "mode": "rw"}}, + volumes={ + str(postgres_data_dir): { + "bind": "/var/lib/postgresql/data", + "mode": "rw", + } + }, detach=True, - remove=False + remove=False, ) # Wait for container to be ready @@ -121,13 +131,19 @@ def setup_pgvector_database(client: docker.DockerClient) -> bool: # Check if container is running container.reload() if container.status == "running": - console.print("[green]pgvector database setup completed successfully.[/green]") - console.print("[blue]Database connection: postgresql://postgres:postgres@localhost:54320/codeindexerdb[/blue]") + console.print( + "[green]pgvector database setup completed successfully.[/green]" + ) + console.print( + "[blue]Database connection: postgresql://postgres:postgres@localhost:54320/codeindexerdb[/blue]" + ) return True else: - console.print(f"[red]Failed to start pgvector container. Status: {container.status}[/red]") + console.print( + f"[red]Failed to start pgvector container. Status: {container.status}[/red]" + ) return False - + except DockerException as e: console.print(f"[red]Error setting up pgvector database: {e}[/red]") return False @@ -138,10 +154,10 @@ def setup_pgvector_database(client: docker.DockerClient) -> bool: def pull_sandboxed_kali_image(client: docker.DockerClient) -> bool: """Pull the sandboxed Kali image. - + Args: client: Docker client instance - + Returns: bool: True if pull successful, False otherwise """ @@ -160,10 +176,10 @@ def pull_sandboxed_kali_image(client: docker.DockerClient) -> bool: def stop_pgvector_container(client: docker.DockerClient) -> bool: """Stop the pgvector container. - + Args: client: Docker client instance - + Returns: bool: True if stopped successfully, False otherwise """ @@ -192,7 +208,7 @@ def init_cli_config(): """Initialize CLI config by prompting for env vars and saving to cache TOML. 
Writes to ~/.cache/deadend/config.toml - + Returns: Path: The path to the created configuration file """ @@ -204,13 +220,15 @@ def init_cli_config(): console.print("Please install Docker from: https://docs.docker.com/get-docker/") console.print("Make sure Docker daemon is running.") raise typer.Exit(1) - + # Check Docker availability first - exit if not available if not check_docker(docker_client): - console.print("\n[red]Docker is required for this application to function properly.[/red]") + console.print( + "\n[red]Docker is required for this application to function properly.[/red]" + ) console.print("Please install and start Docker, then run this command again.") raise typer.Exit(1) - + # Check and setup pgvector database if not check_pgvector_container(docker_client): console.print("\n[blue]pgvector database not found. Setting up...[/blue]") @@ -220,13 +238,15 @@ def init_cli_config(): raise typer.Exit(1) else: console.print("[green]pgvector database is already running.[/green]") - + # Pull sandboxed Kali image console.print("\n[blue]Setting up sandboxed Kali image...[/blue]") if not pull_sandboxed_kali_image(docker_client): - console.print("\n[yellow]Warning: Failed to pull sandboxed Kali image.[/yellow]") + console.print( + "\n[yellow]Warning: Failed to pull sandboxed Kali image.[/yellow]" + ) console.print("Some features may not work properly. You can try again later.") - + cache_dir = Path.home() / ".cache" / "deadend" cache_dir.mkdir(parents=True, exist_ok=True) config_file = cache_dir / "config.toml" @@ -236,24 +256,31 @@ def init_cli_config(): try: with config_file.open("r") as f: existing_config = toml.load(f) - + # Check if config has essential keys and values essential_keys = ["OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GEMINI_API_KEY"] has_essential_config = any( - existing_config.get(key, "").strip() - for key in essential_keys + existing_config.get(key, "").strip() for key in essential_keys ) - + if has_essential_config: - console.print("[green]Configuration file already exists and is populated.[/green]") + console.print( + "[green]Configuration file already exists and is populated.[/green]" + ) console.print(f"Config file: {config_file}") - console.print("If you need to update the configuration, delete the file and run init again.") + console.print( + "If you need to update the configuration, delete the file and run init again." 
+ ) return config_file else: - console.print("[yellow]Configuration file exists but appears to be empty or incomplete.[/yellow]") + console.print( + "[yellow]Configuration file exists but appears to be empty or incomplete.[/yellow]" + ) console.print("Proceeding with configuration setup...") except (toml.TomlDecodeError, OSError) as e: - console.print(f"[yellow]Warning: Could not read existing config file: {e}[/yellow]") + console.print( + f"[yellow]Warning: Could not read existing config file: {e}[/yellow]" + ) console.print("Proceeding with configuration setup...") # Read current environment as defaults @@ -265,7 +292,9 @@ def init_cli_config(): "GEMINI_API_KEY": os.getenv("GEMINI_API_KEY", ""), "GEMINI_MODEL": os.getenv("GEMINI_MODEL", "gemini-2.5-pro"), "EMBEDDING_MODEL": os.getenv("EMBEDDING_MODEL", ""), - "DB_URL": os.getenv("DB_URL", ""), + "DB_URL": os.getenv( + "DB_URL", "postgresql://postgres:postgres@localhost:54320/codeindexerdb" + ), "ZAP_PROXY_API_KEY": os.getenv("ZAP_PROXY_API_KEY", ""), "APP_ENV": os.getenv("APP_ENV", "development"), "LOG_LEVEL": os.getenv("LOG_LEVEL", "INFO"), diff --git a/deadend_cli/src/deadend_cli/rpc_server.py b/deadend_cli/src/deadend_cli/rpc_server.py index 45a65bf..f6f17f6 100644 --- a/deadend_cli/src/deadend_cli/rpc_server.py +++ b/deadend_cli/src/deadend_cli/rpc_server.py @@ -112,6 +112,7 @@ async def _run_task_stream( mode: str = "yolo", ): model_registry = ModelRegistry(config=self.config) + await model_registry.initialize() if not model_registry.has_any_model(): raise RuntimeError( "No LM model configured. Run `deadend init` to initialize the model configuration."
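(Note on the pattern above: chat.py, eval.py, and rpc_server.py all gain the same two-step construction — a synchronous `ModelRegistry(config=...)` followed by `await model_registry.initialize()` — because `__init__` cannot await creation of the shared `aiohttp.ClientSession`. Below is a minimal sketch of that lifecycle, using a simplified stand-in for `ModelRegistry`; the call sites in this diff add only the `initialize()` half and leave session teardown to process exit.)

```python
# Minimal sketch of the construct-then-initialize lifecycle used in the
# diff. Registry is a simplified stand-in for ModelRegistry.
import asyncio
import aiohttp

class Registry:
    def __init__(self) -> None:
        # __init__ is synchronous, so async resources are deferred.
        self._session: aiohttp.ClientSession | None = None

    async def initialize(self) -> None:
        if self._session is None:
            # One shared session, reused across requests instead of
            # opening a new connection pool per call.
            self._session = aiohttp.ClientSession()

    async def close(self) -> None:
        if self._session is not None:
            await self._session.close()
            self._session = None

async def main() -> None:
    registry = Registry()
    await registry.initialize()  # required before first use
    try:
        pass  # issue requests through the shared session here
    finally:
        await registry.close()   # release pooled connections

asyncio.run(main())
```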