206 changes: 206 additions & 0 deletions backend/load-tests/LOAD_TEST_RESULTS.md
@@ -0,0 +1,206 @@
# BoxMeOut Stella — Load Test Results & Baseline Metrics

## Overview

This document tracks performance baselines for the BoxMeOut Stella prediction market platform. Run load tests regularly against staging and before production deploys.

## Prerequisites

```bash
# Install k6
brew install k6 # macOS
sudo apt install k6 # Ubuntu/Debian (requires the k6 APT repository to be configured first)

# Start the backend server
cd backend && npm run dev
```

## Running Tests

```bash
cd backend/load-tests

# Run all scenarios
./run-all.sh

# Run a specific scenario
./run-all.sh --scenario baseline
./run-all.sh --scenario websocket
./run-all.sh --scenario predictions
./run-all.sh --scenario amm

# Override target
./run-all.sh --base-url http://staging.boxmeout.io:3000

# Run individual k6 scripts directly
k6 run scenarios/api-baseline.js
k6 run -e MARKET_ID=abc123 scenarios/predictions-burst.js
```

## Test Scenarios

### 1. API Baseline (`scenarios/api-baseline.js`)

Measures core API latency under moderate load (50 concurrent users).

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/health` | GET | Basic health check |
| `/health/detailed` | GET | Service-level health |
| `/api/markets` | GET | List all markets |
| `/api/markets/:id` | GET | Single market detail |
| `/api/auth/challenge` | POST | Auth nonce request |
| `/metrics` | GET | Prometheus metrics |

**Thresholds:**
- `p(50)` < 200ms
- `p(95)` < 500ms
- `p(99)` < 1000ms
- Error rate < 5%
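
To make the percentile limits concrete, the sketch below evaluates them against a list of latency samples. It uses the nearest-rank method for simplicity. k6 computes its own percentiles internally, so `percentile` and `meetsBaseline` are hypothetical helpers for illustration, not part of the test suite.

```javascript
// Nearest-rank percentile over latency samples in milliseconds (illustrative only).
function percentile(samples, p) {
  if (samples.length === 0) return NaN;
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Limits copied from the API Baseline thresholds above.
const BASELINE_LIMITS_MS = { 50: 200, 95: 500, 99: 1000 };

// True only if every percentile is strictly below its limit.
function meetsBaseline(samples) {
  return Object.entries(BASELINE_LIMITS_MS).every(
    ([p, limit]) => percentile(samples, Number(p)) < limit
  );
}
```

A single slow outlier in a small sample is enough to breach `p(95)`, which is why short runs tend to give noisy pass/fail results.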

### 2. WebSocket Connections (`scenarios/websocket-connections.js`)

Ramps to 1000 concurrent WebSocket connections and sustains them for 2 minutes.

| Phase | Duration | Target Connections |
|-------|----------|--------------------|
| Ramp 1 | 30s | 100 |
| Ramp 2 | 30s | 500 |
| Ramp 3 | 30s | 1000 |
| Sustain | 2m | 1000 |
| Ramp down | 30s | 0 |

**Thresholds:**
- Connection time `p(95)` < 2000ms
- Message latency `p(95)` < 500ms
- Error rate < 10%
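
The ramp profile above maps onto k6's `stages` option. This is a minimal sketch, assuming the scenario uses `stages` directly; the actual `websocket-connections.js` may be structured differently, and `WS_STAGES`, `toSeconds`, and `totalSeconds` are hypothetical names.

```javascript
// Phase table from the WebSocket scenario, expressed in k6 `stages` form.
const WS_STAGES = [
  { duration: '30s', target: 100 },  // Ramp 1
  { duration: '30s', target: 500 },  // Ramp 2
  { duration: '30s', target: 1000 }, // Ramp 3
  { duration: '2m', target: 1000 },  // Sustain
  { duration: '30s', target: 0 },    // Ramp down
];

// Convert a simple duration such as '30s' or '2m' to seconds.
function toSeconds(duration) {
  const match = /^(\d+)(s|m)$/.exec(duration);
  if (!match) throw new Error(`Unsupported duration: ${duration}`);
  return Number(match[1]) * (match[2] === 'm' ? 60 : 1);
}

// Total wall-clock length of the stage plan in seconds.
function totalSeconds(stages) {
  return stages.reduce((sum, stage) => sum + toSeconds(stage.duration), 0);
}
```

In a k6 script the array would typically be passed as `export const options = { stages: WS_STAGES }`. With the durations in the table, the full run lasts 240 seconds (4 minutes).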

### 3. Predictions Burst (`scenarios/predictions-burst.js`)

100 users simultaneously submit predictions on a single market, followed by a sustained load phase.

| Phase | VUs | Duration | Description |
|-------|-----|----------|-------------|
| Burst | 100 | instant | All 100 commit at once |
| Sustained | 20→100 | 3m45s | Continuous prediction flow |

**Tested Operations:**
- `POST /api/markets/:id/predict` — Commit prediction
- `POST /api/predictions/:id/reveal` — Reveal prediction
- `POST /api/markets/:id/buy-shares` — Buy YES/NO shares
- `POST /api/markets/:id/sell-shares` — Sell shares

**Thresholds:**
- Commit `p(50)` < 500ms, `p(95)` < 2000ms
- Reveal `p(50)` < 500ms, `p(95)` < 2000ms
- Buy shares `p(50)` < 300ms, `p(95)` < 1000ms
- Error rate < 15% (blockchain operations may time out)
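
One way to express the two phases is with k6 `scenarios`, using the `per-vu-iterations` executor for the burst and `ramping-vus` for the sustained phase. This is a sketch under assumptions: the executor choice, the 30-second burst window, and the single 20 to 100 ramp are guesses, and `PREDICTION_SCENARIOS` is a hypothetical name; `predictions-burst.js` may do this differently.

```javascript
const PREDICTION_SCENARIOS = {
  burst: {
    executor: 'per-vu-iterations',
    vus: 100,          // all 100 VUs start together
    iterations: 1,     // each VU commits once
    maxDuration: '30s' // assumption: upper bound for the burst to finish
  },
  sustained: {
    executor: 'ramping-vus',
    startTime: '30s',  // assumption: begin after the burst window
    startVUs: 20,
    // Assumption: one linear ramp from 20 to 100 VUs over the full 3m45s.
    stages: [{ duration: '3m45s', target: 100 }],
  },
};
```

In a k6 script this object would be supplied as `export const options = { scenarios: PREDICTION_SCENARIOS }`.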

### 4. AMM High-Frequency Trading (`scenarios/amm-high-frequency.js`)

Simulates high-frequency trading against the AMM with up to 200 trades/second.

| Phase | Duration | Rate | Description |
|-------|----------|------|-------------|
| Warmup | 30s | 5 VUs | Establish baseline prices |
| Ramp | 30s | 10 tx/s | Light trading |
| Ramp | 30s | 50 tx/s | Medium frequency |
| Stress | 1m | 100 tx/s | High frequency |
| Peak | 1m | 200 tx/s | Maximum stress |
| Cooldown | 45s | 50→0 tx/s | Wind down |

**Trade Distribution:**
- 40% Buy shares
- 30% Sell shares
- 20% Read pool state
- 10% Add liquidity

**Concurrent readers:** 20 VUs continuously reading pool state during trading.

**Thresholds:**
- Buy/Sell `p(50)` < 300ms, `p(95)` < 1500ms
- Pool state read `p(50)` < 100ms, `p(95)` < 300ms
- Trade error rate < 20%
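
The trade distribution can be implemented as a weighted random choice per iteration. The sketch below is one possible approach; `TRADE_MIX`, `pickOperation`, and the operation names are hypothetical and may not match `amm-high-frequency.js`.

```javascript
// Weights mirror the trade distribution listed above (they sum to 100).
const TRADE_MIX = [
  { op: 'buy', weight: 40 },
  { op: 'sell', weight: 30 },
  { op: 'readPool', weight: 20 },
  { op: 'addLiquidity', weight: 10 },
];

// Pick an operation for a value r in [0, 1); defaults to Math.random().
function pickOperation(r = Math.random()) {
  const total = TRADE_MIX.reduce((sum, entry) => sum + entry.weight, 0);
  let cursor = r * total;
  for (const { op, weight } of TRADE_MIX) {
    if (cursor < weight) return op;
    cursor -= weight;
  }
  // Fallback for r at or above 1.
  return TRADE_MIX[TRADE_MIX.length - 1].op;
}
```

Accepting `r` as a parameter keeps the function deterministic under test while still allowing random selection inside a VU iteration.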

---

## Baseline Metrics Template

Fill in after the first test run:

### Environment
- **Date:** YYYY-MM-DD
- **Server:** (e.g., MacBook M2, EC2 t3.medium)
- **Node.js:** (version)
- **PostgreSQL:** (version)
- **Redis:** (version)
- **Network:** (local / staging / production)

### API Baseline Results

| Endpoint | p50 | p95 | p99 | Max | Error Rate |
|----------|-----|-----|-----|-----|------------|
| `GET /health` | —ms | —ms | —ms | —ms | —% |
| `GET /api/markets` | —ms | —ms | —ms | —ms | —% |
| `GET /api/markets/:id` | —ms | —ms | —ms | —ms | —% |
| `POST /api/auth/challenge` | —ms | —ms | —ms | —ms | —% |
| `GET /metrics` | —ms | —ms | —ms | —ms | —% |

### WebSocket Results

| Metric | Value |
|--------|-------|
| Peak concurrent connections | — |
| Connection time (p95) | —ms |
| Message latency (p95) | —ms |
| Connection error rate | —% |
| Messages received | — |

### Predictions Burst Results

| Operation | p50 | p95 | p99 | Success Rate |
|-----------|-----|-----|-----|--------------|
| Commit prediction | —ms | —ms | —ms | —% |
| Reveal prediction | —ms | —ms | —ms | —% |
| Buy shares | —ms | —ms | —ms | —% |
| Sell shares | —ms | —ms | —ms | —% |

### AMM High-Frequency Results

| Metric | Value |
|--------|-------|
| Total trades executed | — |
| Peak throughput (tx/sec) | — |
| Buy latency (p95) | —ms |
| Sell latency (p95) | —ms |
| Pool state read (p95) | —ms |
| Max slippage observed | —% |
| Trade error rate | —% |

---

## Interpreting Results

### Green (Pass)
- All percentile thresholds met
- Error rates within bounds
- No connection drops during sustained phase

### Yellow (Warning)
- p99 exceeding thresholds but p95 passing
- Error rate 5-15%
- Sporadic WebSocket disconnects

### Red (Fail)
- p95 thresholds breached
- Error rate > 15%
- WebSocket connections unable to sustain target
- AMM pool state inconsistency detected
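
The latency and error-rate criteria above can be condensed into a small classifier. This sketch covers only those two signals; WebSocket stability and AMM pool consistency still need a separate check. `classifyRun` and its input shape are hypothetical.

```javascript
// errorRate is a fraction (0.05 means 5%).
function classifyRun({ p95Pass, p99Pass, errorRate }) {
  // Red: p95 breached or error rate above 15%.
  if (!p95Pass || errorRate > 0.15) return 'red';
  // Yellow: p99 breached while p95 passes, or error rate between 5% and 15%.
  if (!p99Pass || errorRate >= 0.05) return 'yellow';
  return 'green';
}
```

For example, a run with passing p95, failing p99, and a 2% error rate would be classified as yellow.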

## Common Bottlenecks

1. **Database connection pool exhaustion** — Increase Prisma pool size
2. **Redis connection limits** — Scale Redis or use connection pooling
3. **Blockchain RPC rate limiting** — Queue transactions, use batch calls
4. **Node.js event loop blocking** — Profile with `--prof`, move crypto to worker threads
5. **WebSocket memory** — Each connection ~50KB; 1000 connections = ~50MB baseline
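
The WebSocket memory estimate in item 5 is simple arithmetic, sketched here so it can be re-run for other connection targets. The ~50KB per connection figure comes from the list above; decimal units and the name `estimateWsMemoryMB` are assumptions.

```javascript
// ~50KB per connection, as stated in item 5 (decimal KB assumed).
const BYTES_PER_CONNECTION = 50 * 1000;

// Rough baseline memory in MB for a given number of open connections.
function estimateWsMemoryMB(connections) {
  return (connections * BYTES_PER_CONNECTION) / 1e6;
}
```

With 1000 connections this gives 50 MB, matching the figure above; actual usage depends on message buffers and per-connection state.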
54 changes: 54 additions & 0 deletions backend/load-tests/config.js
@@ -0,0 +1,54 @@
// Load test shared configuration for BoxMeOut Stella
// Usage: import { CONFIG, THRESHOLDS } from './config.js';

export const CONFIG = {
  // Target server
  BASE_URL: __ENV.BASE_URL || 'http://localhost:3000',
  WS_URL: __ENV.WS_URL || 'ws://localhost:3000',

  // Auth
  ADMIN_PUBLIC_KEY: __ENV.ADMIN_PUBLIC_KEY || 'GCTEST000000000000000000000000000000000000000000000000000',

  // Test market ID (set via env or use default)
  MARKET_ID: __ENV.MARKET_ID || 'test-market-1',

  // Timing
  RAMP_UP_DURATION: '30s',
  STEADY_STATE_DURATION: '2m',
  RAMP_DOWN_DURATION: '15s',

  // Rate limit aware — the API has 100 req/min per IP
  API_RATE_LIMIT: 100,
};

// Shared k6 thresholds for pass/fail criteria
export const THRESHOLDS = {
  // HTTP request duration targets
  http_req_duration: [
    'p(50)<200',  // p50 under 200ms
    'p(95)<500',  // p95 under 500ms
    'p(99)<1000', // p99 under 1s
  ],
  // HTTP request failure rate
  http_req_failed: [
    'rate<0.05', // Less than 5% failure rate
  ],
  // Custom metric thresholds (defined per-scenario)
};

// Common HTTP params
export const HEADERS = {
  'Content-Type': 'application/json',
  Accept: 'application/json',
};

// Generate a fake Stellar public key for load testing.
// Deterministic per VU: the same vuId always yields the same 56-character key.
// The key only matches the shape of a Stellar address; it does not compute the StrKey checksum.
export function generatePublicKey(vuId) {
  const chars = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ234567';
  let key = 'G';
  for (let i = 0; i < 55; i++) {
    key += chars[(vuId * 7 + i * 13) % chars.length];
  }
  return key;
}
89 changes: 89 additions & 0 deletions backend/load-tests/helpers/auth.js
@@ -0,0 +1,89 @@
// Authentication helper for load tests
// Handles the challenge-sign-verify flow for generating JWT tokens
import http from 'k6/http';
import { check } from 'k6';
import { CONFIG, HEADERS } from '../config.js';

// Request a challenge nonce for a given public key
export function requestChallenge(publicKey) {
  const res = http.post(
    `${CONFIG.BASE_URL}/api/auth/challenge`,
    JSON.stringify({ publicKey }),
    { headers: HEADERS, tags: { name: 'auth_challenge' } }
  );

  const success = check(res, {
    'challenge: status 200': (r) => r.status === 200,
    'challenge: has nonce': (r) => {
      try {
        const body = JSON.parse(r.body);
        return !!body.nonce || !!(body.data && body.data.nonce);
      } catch {
        return false;
      }
    },
  });

  if (!success) {
    return null;
  }

  try {
    const body = JSON.parse(res.body);
    return body.data || body;
  } catch {
    return null;
  }
}

// Login with a pre-signed payload (for load testing, we skip real signing)
// In a real load test, you'd use a pool of pre-generated tokens
export function login(publicKey, nonce, signature) {
  const res = http.post(
    `${CONFIG.BASE_URL}/api/auth/login`,
    JSON.stringify({ publicKey, nonce, signature }),
    { headers: HEADERS, tags: { name: 'auth_login' } }
  );

  const success = check(res, {
    'login: status 200': (r) => r.status === 200,
  });

  if (!success) {
    return null;
  }

  try {
    const body = JSON.parse(res.body);
    return body.data || body;
  } catch {
    return null;
  }
}

// Build authorized headers from a token
export function authHeaders(token) {
  return {
    ...HEADERS,
    Authorization: `Bearer ${token}`,
  };
}

// Refresh an access token.
// The parameter is named `token` so it does not shadow the function name.
export function refreshToken(token) {
  const res = http.post(
    `${CONFIG.BASE_URL}/api/auth/refresh`,
    JSON.stringify({ refreshToken: token }),
    { headers: HEADERS, tags: { name: 'auth_refresh' } }
  );

  if (res.status === 200) {
    try {
      const body = JSON.parse(res.body);
      return body.data || body;
    } catch {
      return null;
    }
  }
  return null;
}
2 changes: 2 additions & 0 deletions backend/load-tests/results/.gitignore
@@ -0,0 +1,2 @@
*.json
!.gitignore
1 change: 1 addition & 0 deletions backend/load-tests/results/.gitkeep
@@ -0,0 +1 @@
results/*.json