Netwalls · GoSTEAN · Feb 21, 2026 · Feb 20, 2026
diff --git a/backend/load-tests/LOAD_TEST_RESULTS.md b/backend/load-tests/LOAD_TEST_RESULTS.md
@@ -0,0 +1,206 @@
+# BoxMeOut Stella — Load Test Results & Baseline Metrics
+
+## Overview
+
+This document tracks performance baselines for the BoxMeOut Stella prediction market platform. Run load tests regularly against staging and before production deploys.
+
+## Prerequisites
+
+```bash
+# Install k6
+brew install k6          # macOS
+sudo apt install k6      # Ubuntu/Debian (requires the k6 apt repository to be added first)
+
+# Start the backend server
+cd backend && npm run dev
+```
+
+## Running Tests
+
+```bash
+cd backend/load-tests
+
+# Run all scenarios
+./run-all.sh
+
+# Run a specific scenario
+./run-all.sh --scenario baseline
+./run-all.sh --scenario websocket
+./run-all.sh --scenario predictions
+./run-all.sh --scenario amm
+
+# Override target
+./run-all.sh --base-url http://staging.boxmeout.io:3000
+
+# Run individual k6 scripts directly
+k6 run scenarios/api-baseline.js
+k6 run -e MARKET_ID=abc123 scenarios/predictions-burst.js
+```
+
+## Test Scenarios
+
+### 1. API Baseline (`scenarios/api-baseline.js`)
+
+Measures core API latency under moderate load (50 concurrent users).
+
+| Endpoint | Method | Description |
+|----------|--------|-------------|
+| `/health` | GET | Basic health check |
+| `/health/detailed` | GET | Service-level health |
+| `/api/markets` | GET | List all markets |
+| `/api/markets/:id` | GET | Single market detail |
+| `/api/auth/challenge` | POST | Auth nonce request |
+| `/metrics` | GET | Prometheus metrics |
+
+**Thresholds:**
+- `p(50)` < 200ms
+- `p(95)` < 500ms
+- `p(99)` < 1000ms
+- Error rate < 5%
+
+### 2. WebSocket Connections (`scenarios/websocket-connections.js`)
+
+Ramps to 1000 concurrent WebSocket connections and sustains for 2 minutes.
+
+| Phase | Duration | Target Connections |
+|-------|----------|--------------------|
+| Ramp 1 | 30s | 100 |
+| Ramp 2 | 30s | 500 |
+| Ramp 3 | 30s | 1000 |
+| Sustain | 2m | 1000 |
+| Ramp down | 30s | 0 |
+
+**Thresholds:**
+- Connection time `p(95)` < 2000ms
+- Message latency `p(95)` < 500ms
+- Error rate < 10%
+
+### 3. Predictions Burst (`scenarios/predictions-burst.js`)
+
+100 users submit predictions on a single market simultaneously, followed by a sustained load phase.
+
+| Phase | VUs | Duration | Description |
+|-------|-----|----------|-------------|
+| Burst | 100 | instant | All 100 commit at once |
+| Sustained | 20→100 | 3m45s | Continuous prediction flow |
+
+**Tested Operations:**
+- `POST /api/markets/:id/predict` — Commit prediction
+- `POST /api/predictions/:id/reveal` — Reveal prediction
+- `POST /api/markets/:id/buy-shares` — Buy YES/NO shares
+- `POST /api/markets/:id/sell-shares` — Sell shares
+
+**Thresholds:**
+- Commit `p(50)` < 500ms, `p(95)` < 2000ms
+- Reveal `p(50)` < 500ms, `p(95)` < 2000ms
+- Buy shares `p(50)` < 300ms, `p(95)` < 1000ms
+- Error rate < 15% (blockchain operations may time out)
+
+### 4. AMM High-Frequency Trading (`scenarios/amm-high-frequency.js`)
+
+Simulates high-frequency trading against the AMM with up to 200 trades/second.
+
+| Phase | Duration | Rate | Description |
+|-------|----------|------|-------------|
+| Warmup | 30s | 5 VUs | Establish baseline prices |
+| Ramp | 30s | 10 tx/s | Light trading |
+| Ramp | 30s | 50 tx/s | Medium frequency |
+| Stress | 1m | 100 tx/s | High frequency |
+| Peak | 1m | 200 tx/s | Maximum stress |
+| Cooldown | 45s | 50→0 tx/s | Wind down |
+
+**Trade Distribution:**
+- 40% Buy shares
+- 30% Sell shares
+- 20% Read pool state
+- 10% Add liquidity
+
+**Concurrent readers:** 20 VUs continuously reading pool state during trading.
+
+**Thresholds:**
+- Buy/Sell `p(50)` < 300ms, `p(95)` < 1500ms
+- Pool state read `p(50)` < 100ms, `p(95)` < 300ms
+- Trade error rate < 20%
+
+---
+
+## Baseline Metrics Template
+
+Fill in after the first test run:
+
+### Environment
+- **Date:** YYYY-MM-DD
+- **Server:** (e.g., MacBook M2, EC2 t3.medium)
+- **Node.js:** (version)
+- **PostgreSQL:** (version)
+- **Redis:** (version)
+- **Network:** (local / staging / production)
+
+### API Baseline Results
+
+| Endpoint | p50 | p95 | p99 | Max | Error Rate |
+|----------|-----|-----|-----|-----|------------|
+| `GET /health` | —ms | —ms | —ms | —ms | —% |
+| `GET /api/markets` | —ms | —ms | —ms | —ms | —% |
+| `GET /api/markets/:id` | —ms | —ms | —ms | —ms | —% |
+| `POST /api/auth/challenge` | —ms | —ms | —ms | —ms | —% |
+| `GET /metrics` | —ms | —ms | —ms | —ms | —% |
+
+### WebSocket Results
+
+| Metric | Value |
+|--------|-------|
+| Peak concurrent connections | — |
+| Connection time (p95) | —ms |
+| Message latency (p95) | —ms |
+| Connection error rate | —% |
+| Messages received | — |
+
+### Predictions Burst Results
+
+| Operation | p50 | p95 | p99 | Success Rate |
+|-----------|-----|-----|-----|--------------|
+| Commit prediction | —ms | —ms | —ms | —% |
+| Reveal prediction | —ms | —ms | —ms | —% |
+| Buy shares | —ms | —ms | —ms | —% |
+| Sell shares | —ms | —ms | —ms | —% |
+
+### AMM High-Frequency Results
+
+| Metric | Value |
+|--------|-------|
+| Total trades executed | — |
+| Peak throughput (tx/sec) | — |
+| Buy latency (p95) | —ms |
+| Sell latency (p95) | —ms |
+| Pool state read (p95) | —ms |
+| Max slippage observed | —% |
+| Trade error rate | —% |
+
+---
+
+## Interpreting Results
+
+### Green (Pass)
+- All percentile thresholds met
+- Error rates within bounds
+- No connection drops during sustained phase
+
+### Yellow (Warning)
+- p99 exceeding thresholds but p95 passing
+- Error rate 5-15%
+- Sporadic WebSocket disconnects
+
+### Red (Fail)
+- p95 thresholds breached
+- Error rate > 15%
+- WebSocket connections unable to sustain target
+- AMM pool state inconsistency detected
+
+## Common Bottlenecks
+
+1. **Database connection pool exhaustion** — Increase the Prisma pool size (the `connection_limit` parameter in the database URL)
+2. **Redis connection limits** — Scale Redis or use connection pooling
+3. **Blockchain RPC rate limiting** — Queue transactions, use batch calls
+4. **Node.js event loop blocking** — Profile with `--prof`, move crypto to worker threads
+5. **WebSocket memory** — Each connection ~50KB; 1000 connections = ~50MB baseline
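Review note: the Green/Yellow/Red rules under "Interpreting Results" can be sketched as a small classifier. This is a minimal sketch, not part of the PR. The function name `classifyRun` and the shape of its arguments are hypothetical, and it covers only the latency and error-rate rules, leaving out the WebSocket and AMM consistency checks.

```javascript
// Hypothetical helper: classify one scenario run as green / yellow / red.
// `result` holds measured values; `limits` holds that scenario's thresholds.
function classifyRun(result, limits) {
  // Red: the p95 threshold is breached or the error rate exceeds 15%.
  if (result.p95 >= limits.p95 || result.errorRate > 0.15) {
    return 'red';
  }
  // Yellow: p99 over its threshold while p95 passes, or error rate in 5-15%.
  if (result.p99 >= limits.p99 || result.errorRate >= 0.05) {
    return 'yellow';
  }
  return 'green';
}

// Example using the API baseline thresholds (p95 < 500ms, p99 < 1000ms).
const apiLimits = { p95: 500, p99: 1000 };
console.log(classifyRun({ p95: 320, p99: 800, errorRate: 0.01 }, apiLimits));  // green
console.log(classifyRun({ p95: 450, p99: 1200, errorRate: 0.02 }, apiLimits)); // yellow
console.log(classifyRun({ p95: 650, p99: 1500, errorRate: 0.02 }, apiLimits)); // red
```

Because the error-rate bands in that section are fixed while the k6 thresholds differ per scenario, a real implementation would probably take the bands as parameters too.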
diff --git a/backend/load-tests/config.js b/backend/load-tests/config.js
@@ -0,0 +1,54 @@
+// Load test shared configuration for BoxMeOut Stella
+// Usage: import { CONFIG, THRESHOLDS } from './config.js';
+
+export const CONFIG = {
+  // Target server
+  BASE_URL: __ENV.BASE_URL || 'http://localhost:3000',
+  WS_URL: __ENV.WS_URL || 'ws://localhost:3000',
+
+  // Auth
+  ADMIN_PUBLIC_KEY: __ENV.ADMIN_PUBLIC_KEY || 'GCTEST000000000000000000000000000000000000000000000000000',
+
+  // Test market ID (set via env or use default)
+  MARKET_ID: __ENV.MARKET_ID || 'test-market-1',
+
+  // Timing
+  RAMP_UP_DURATION: '30s',
+  STEADY_STATE_DURATION: '2m',
+  RAMP_DOWN_DURATION: '15s',
+
+  // Rate-limit aware: the API allows 100 req/min per IP
+  API_RATE_LIMIT: 100,
+};
+
+// Shared k6 thresholds for pass/fail criteria
+export const THRESHOLDS = {
+  // HTTP request duration targets
+  http_req_duration: [
+    'p(50)<200',   // p50 under 200ms
+    'p(95)<500',   // p95 under 500ms
+    'p(99)<1000',  // p99 under 1s
+  ],
+  // HTTP request failure rate
+  http_req_failed: [
+    'rate<0.05',   // Less than 5% failure rate
+  ],
+  // Custom metric thresholds (defined per-scenario)
+};
+
+// Common HTTP params
+export const HEADERS = {
+  'Content-Type': 'application/json',
+  Accept: 'application/json',
+};
+
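+// Generate a fake Stellar-style public key for load testing.
+// Deterministic per VU: 'G' followed by 55 base32 characters (56 total).
+export function generatePublicKey(vuId) {
+  const chars = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ234567';
+  let key = 'G';
+  for (let i = 0; i < 55; i++) {
+    // The first four characters encode vuId in base 32, so IDs below 32^4
+    // never collide; the rest keep the original filler pattern.
+    const index = i < 4 ? Math.floor(vuId / Math.pow(32, i)) % 32 : vuId * 7 + i * 13;
+    key += chars[index % chars.length];
+  }
+  return key;
+}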
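Review note: `API_RATE_LIMIT` is declared in `config.js` but nothing in this diff uses it. A minimal sketch of how it could drive per-VU pacing follows, assuming all VUs share one source IP and each iteration sends a single request. The helper name `pacingSeconds` is hypothetical and not part of the PR.

```javascript
// Hypothetical helper, not part of config.js: seconds each VU should wait
// between requests so that all VUs together stay under a per-IP rate limit.
function pacingSeconds(vus, limitPerMinute) {
  if (vus <= 0 || limitPerMinute <= 0) {
    throw new RangeError('vus and limitPerMinute must be positive');
  }
  // Combined budget is limitPerMinute requests per 60 s, shared by `vus` VUs.
  return (vus * 60) / limitPerMinute;
}

console.log(pacingSeconds(1, 100));  // 0.6
console.log(pacingSeconds(50, 100)); // 30
```

In a k6 script the result could be passed to `sleep()` from the `k6` module. The second call suggests that 50 VUs behind one IP would each need to wait about 30 seconds per request to stay within 100 req/min, so the baseline scenario may need the limit raised or bypassed on the target environment.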
diff --git a/backend/load-tests/helpers/auth.js b/backend/load-tests/helpers/auth.js
@@ -0,0 +1,89 @@
+// Authentication helper for load tests
+// Handles the challenge-sign-verify flow for generating JWT tokens
+import http from 'k6/http';
+import { check } from 'k6';
+import { CONFIG, HEADERS } from '../config.js';
+
+// Request a challenge nonce for a given public key
+export function requestChallenge(publicKey) {
+  const res = http.post(
+    `${CONFIG.BASE_URL}/api/auth/challenge`,
+    JSON.stringify({ publicKey }),
+    { headers: HEADERS, tags: { name: 'auth_challenge' } }
+  );
+
+  const success = check(res, {
+    'challenge: status 200': (r) => r.status === 200,
+    'challenge: has nonce': (r) => {
+      try {
+        const body = JSON.parse(r.body);
+        return !!body.nonce || !!(body.data && body.data.nonce);
+      } catch {
+        return false;
+      }
+    },
+  });
+
+  if (!success) {
+    return null;
+  }
+
+  try {
+    const body = JSON.parse(res.body);
+    return body.data || body;
+  } catch {
+    return null;
+  }
+}
+
+// Log in with a pre-signed payload; this helper does not sign the challenge itself.
+// For realistic runs, use a pool of pre-generated tokens instead.
+export function login(publicKey, nonce, signature) {
+  const res = http.post(
+    `${CONFIG.BASE_URL}/api/auth/login`,
+    JSON.stringify({ publicKey, nonce, signature }),
+    { headers: HEADERS, tags: { name: 'auth_login' } }
+  );
+
+  const success = check(res, {
+    'login: status 200': (r) => r.status === 200,
+  });
+
+  if (!success) {
+    return null;
+  }
+
+  try {
+    const body = JSON.parse(res.body);
+    return body.data || body;
+  } catch {
+    return null;
+  }
+}
+
+// Build authorized headers from a token
+export function authHeaders(token) {
+  return {
+    ...HEADERS,
+    Authorization: `Bearer ${token}`,
+  };
+}
+
+// Refresh an access token
+export function refreshToken(token) {
+  const res = http.post(
+    `${CONFIG.BASE_URL}/api/auth/refresh`,
+    JSON.stringify({ refreshToken: token }),
+    { headers: HEADERS, tags: { name: 'auth_refresh' } }
+  );
+
+  if (res.status === 200) {
+    try {
+      const body = JSON.parse(res.body);
+      return body.data || body;
+    } catch {
+      return null;
+    }
+  }
+  return null;
+}
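Review note: the comment above `login` mentions a pool of pre-generated tokens. One way that could look is sketched below, assuming the tokens are already loaded into an array and VU ids are 1-based, as k6's `__VU` is inside VU code. The name `pickToken` and the placeholder token values are hypothetical.

```javascript
// Hypothetical helper: map a 1-based VU id onto a pool of pre-generated tokens.
function pickToken(tokens, vuId) {
  if (!Array.isArray(tokens) || tokens.length === 0) {
    throw new Error('token pool is empty');
  }
  if (!Number.isInteger(vuId) || vuId < 1) {
    throw new RangeError('vuId must be an integer >= 1');
  }
  // Wrap around so more VUs than tokens still get a token (with reuse).
  return tokens[(vuId - 1) % tokens.length];
}

// Placeholder values; real tokens would be generated ahead of the run.
const pool = ['token-a', 'token-b', 'token-c'];
console.log(pickToken(pool, 1)); // token-a
console.log(pickToken(pool, 4)); // token-a (wraps around)
```

The selected token can then be passed to `authHeaders(token)` from this file. In k6 the pool itself would likely be loaded once in the init context, for example through `SharedArray` from `k6/data`, rather than rebuilt per VU.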
diff --git a/backend/load-tests/results/.gitignore b/backend/load-tests/results/.gitignore
@@ -0,0 +1,2 @@
+*.json
+!.gitignore
diff --git a/backend/load-tests/results/.gitkeep b/backend/load-tests/results/.gitkeep
@@ -0,0 +1 @@
+results/*.json