
feat(proxy): add observability and rate limiting to API proxy #1038

Merged
Mossaka merged 13 commits into main from feat/api-proxy-observability-ratelimit on Feb 25, 2026


@Mossaka Mossaka commented Feb 25, 2026

Summary

  • Observability: Structured JSON logging, in-memory metrics (counters, histograms, gauges), request tracing via X-Request-ID headers, enhanced /health endpoint with metrics summary, new /metrics endpoint
  • Rate Limiting: Per-provider sliding window rate limiter (RPM, RPH, bytes/min) with 429 responses, Retry-After headers, CLI configuration flags (--rate-limit-rpm, --rate-limit-rph, --rate-limit-bytes-pm, --no-rate-limit)
  • Zero new dependencies: All implemented with Node.js built-ins only (crypto.randomUUID, native data structures)

Changes

API Proxy (containers/api-proxy/)

| File | Description |
| --- | --- |
| logging.js | Structured JSON logging module (generateRequestId, sanitizeForLog, logRequest) |
| metrics.js | In-memory metrics: counters, histograms with fixed buckets, gauges, percentile calculation |
| rate-limiter.js | Sliding window counter rate limiter with per-provider independence, fail-open behavior |
| server.js | Integrated logging/metrics/rate-limiting into proxy flow, added /metrics endpoint, enhanced /health |
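
The fail-open behavior listed for rate-limiter.js can be illustrated with a small wrapper. This is only a sketch under assumptions: `failOpen` and the shape of the returned object are hypothetical and are not taken from the PR's code.

```javascript
// Hypothetical wrapper (not from rate-limiter.js): if the limiter itself
// throws, treat the request as allowed instead of failing the proxy.
function failOpen(checkFn) {
  return function guardedCheck(provider, requestBytes) {
    try {
      return checkFn(provider, requestBytes);
    } catch (err) {
      // Fail open: an internal limiter bug must not block traffic.
      const message = err instanceof Error ? err.message : String(err);
      return { allowed: true, error: message };
    }
  };
}

// A limiter that crashes internally still lets the request through.
const broken = failOpen(() => { throw new Error('corrupt window state'); });
// A limiter that works normally keeps its own decision.
const strict = failOpen(() => ({ allowed: false, limitType: 'rpm' }));

console.log(broken('openai', 0).allowed); // true
console.log(strict('openai', 0).allowed); // false
```

The trade-off is that a limiter bug silently disables protection, so a real implementation would probably also log or count the swallowed error.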

TypeScript (src/)

| File | Description |
| --- | --- |
| cli.ts | Added --rate-limit-rpm/rph/bytes-pm and --no-rate-limit flags with validation |
| types.ts | Added RateLimitConfig interface and rateLimitConfig to WrapperConfig |
| docker-manager.ts | Pass rate limit config as AWF_RATE_LIMIT_* env vars to api-proxy container |
| tests/fixtures/awf-runner.ts | Extended AwfOptions with rate limit fields |
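
As a rough illustration of the env var passthrough in docker-manager.ts, the mapping could look like the sketch below. The helper name `rateLimitEnv` is hypothetical and the real code is TypeScript. The env var names and config fields are the ones that appear in the review snippets further down this page.

```javascript
// Hypothetical helper: turn a RateLimitConfig-shaped object into the
// AWF_RATE_LIMIT_* variables read by the api-proxy container.
function rateLimitEnv(rateLimitConfig) {
  if (!rateLimitConfig) return {}; // API proxy disabled: nothing to pass
  return {
    AWF_RATE_LIMIT_ENABLED: String(rateLimitConfig.enabled),
    AWF_RATE_LIMIT_RPM: String(rateLimitConfig.rpm),
    AWF_RATE_LIMIT_RPH: String(rateLimitConfig.rph),
    AWF_RATE_LIMIT_BYTES_PM: String(rateLimitConfig.bytesPm),
  };
}

const env = rateLimitEnv({ enabled: true, rpm: 60, rph: 1000, bytesPm: 52428800 });
console.log(env.AWF_RATE_LIMIT_RPM); // "60"
```

Environment variables are always strings, which is why the container side has to parse and validate them again.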

Tests

| File | Tests |
| --- | --- |
| containers/api-proxy/logging.test.js | 16 tests: UUID generation, sanitization, JSON output |
| containers/api-proxy/metrics.test.js | 30 tests: counters, histograms, gauges, percentiles, memory bounds |
| containers/api-proxy/rate-limiter.test.js | 29 tests: RPM/RPH/bytes limiting, provider independence, window rollover |
| src/docker-manager.test.ts | +2 tests: rate limit env var passthrough |
| tests/integration/api-proxy-observability.test.ts | 5 tests: /metrics, /health, X-Request-ID, metrics increment, rate_limits |
| tests/integration/api-proxy-rate-limit.test.ts | 7 tests: 429 response, Retry-After, X-RateLimit headers, --no-rate-limit, custom RPM |

CI

  • Added API proxy unit test step to build.yml

Design Decisions

  1. Zero dependencies — Uses crypto.randomUUID() for request IDs, native arrays for metrics. Keeps container image small and avoids supply chain risk.
  2. Fixed-bucket histograms — Memory bounded regardless of request volume. Buckets: [10, 50, 100, 250, 500, 1000, 2500, 5000, 10000, 30000] ms.
  3. Sliding window counter — 1-second granularity for per-minute limits, 1-minute granularity for per-hour limits. Smooth rate limiting without burst-at-boundary issues.
  4. Fail-open — Rate limiter catches internal errors and allows requests through. Never crashes the proxy.
  5. Backwards compatible — Health check is a superset of previous format. Rate limiting is on by default with generous limits (60 RPM).
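
To make the fixed-bucket histogram decision concrete, here is a minimal sketch that uses the bucket bounds listed above. The function names and the percentile rule (report the upper bound of the bucket that contains the target rank) are assumptions for illustration; metrics.js may compute percentiles differently.

```javascript
// Bucket upper bounds in milliseconds, as listed in the design decisions.
const BUCKETS_MS = [10, 50, 100, 250, 500, 1000, 2500, 5000, 10000, 30000];

function createHistogram() {
  // One counter per bucket plus one overflow slot, so memory stays constant
  // no matter how many observations are recorded.
  return { counts: new Array(BUCKETS_MS.length + 1).fill(0), count: 0, sum: 0 };
}

function observe(hist, valueMs) {
  let i = BUCKETS_MS.findIndex((bound) => valueMs <= bound);
  if (i === -1) i = BUCKETS_MS.length; // larger than the last bound
  hist.counts[i] += 1;
  hist.count += 1;
  hist.sum += valueMs;
}

// Approximate percentile: upper bound of the bucket holding the target rank.
function percentile(hist, p) {
  if (hist.count === 0) return 0;
  const target = Math.ceil((p / 100) * hist.count);
  let seen = 0;
  for (let i = 0; i < hist.counts.length; i++) {
    seen += hist.counts[i];
    if (seen >= target) return i < BUCKETS_MS.length ? BUCKETS_MS[i] : Infinity;
  }
  return Infinity;
}

const h = createHistogram();
[5, 42, 42, 180, 900, 45000].forEach((ms) => observe(h, ms));
console.log(percentile(h, 50), percentile(h, 99)); // 50 Infinity
```

The price of bounded memory is resolution: every latency between 10 ms and 50 ms is reported as 50 ms, and anything above 30 s is only known to be in the overflow slot.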

Test plan

  • TypeScript builds cleanly (npm run build)
  • ESLint passes (0 errors)
  • 75 API proxy unit tests pass (npm test in containers/api-proxy/)
  • 188 docker-manager tests pass
  • 162 CLI tests pass (1 pre-existing failure unrelated to this PR)
  • Integration tests pass in CI (Docker required)
  • Existing API proxy integration tests still pass
  • New observability integration tests pass (5 tests)
  • New rate limiting integration tests pass (7 tests)

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings February 25, 2026 19:40
@github-actions
Contributor

⚠️ Coverage Regression Detected

This PR decreases test coverage. Please add tests to maintain coverage levels.

Overall Coverage

| Metric | Base | PR | Delta |
| --- | --- | --- | --- |
| Lines | 82.39% | 81.57% | 📉 -0.82% |
| Statements | 82.32% | 81.52% | 📉 -0.80% |
| Functions | 82.74% | 82.74% | ➡️ +0.00% |
| Branches | 74.55% | 73.19% | 📉 -1.36% |

📁 Per-file Coverage Changes (2 files)

| File | Lines (Before → After) | Statements (Before → After) |
| --- | --- | --- |
| src/cli.ts | 43.8% → 40.9% (-2.84%) | 43.8% → 41.0% (-2.83%) |
| src/docker-manager.ts | 83.6% → 84.1% (+0.56%) | 82.8% → 83.4% (+0.54%) |

Coverage comparison generated by scripts/ci/compare-coverage.ts

@Mossaka changed the title from "feat(api-proxy): add observability and rate limiting" to "feat(proxy): add observability and rate limiting to API proxy" on Feb 25, 2026
Contributor

Copilot AI left a comment

Pull request overview

Adds observability (structured logs, in-memory metrics, tracing) and per-provider rate limiting to the API proxy sidecar, with CLI/config wiring to pass settings into the container and new unit/integration tests plus CI coverage.

Changes:

  • Implemented structured JSON logging, metrics collection, enhanced /health, and new /metrics endpoint in the API proxy.
  • Added per-provider rate limiting configuration via CLI flags and propagated settings to the api-proxy container via env vars.
  • Added unit tests for api-proxy modules, new integration tests, and a CI step to run api-proxy unit tests.

Reviewed changes

Copilot reviewed 16 out of 18 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| tests/integration/api-proxy-rate-limit.test.ts | New integration coverage for 429/Retry-After and rate-limit-related endpoints/headers. |
| tests/integration/api-proxy-observability.test.ts | New integration coverage for /metrics, /health, and X-Request-ID. |
| tests/fixtures/awf-runner.ts | Extends runner options to pass rate-limit flags through the CLI. |
| src/types.ts | Introduces RateLimitConfig and wires it into WrapperConfig. |
| src/cli.ts | Adds CLI flags for rate limiting / disabling rate limiting and builds rateLimitConfig. |
| src/docker-manager.ts | Propagates rate limit configuration to the api-proxy container via AWF_RATE_LIMIT_* env vars. |
| src/docker-manager.test.ts | Adds tests asserting rate-limit env var passthrough behavior. |
| containers/api-proxy/server.js | Integrates logging/metrics/rate-limiting and adds management endpoints handling. |
| containers/api-proxy/rate-limiter.js | New sliding-window rate limiter implementation with env-based factory. |
| containers/api-proxy/rate-limiter.test.js | Unit tests for rate limiter behavior (rpm/rph/bytes, rollover, fail-open, status). |
| containers/api-proxy/metrics.js | New in-memory counters/histograms/gauges and summary helpers. |
| containers/api-proxy/metrics.test.js | Unit tests for metrics behavior and output shape. |
| containers/api-proxy/logging.js | New structured JSON logging helpers. |
| containers/api-proxy/logging.test.js | Unit tests for logging helpers (UUID, sanitization, JSON output). |
| containers/api-proxy/package.json | Adds npm test and jest dev dependency for api-proxy unit tests. |
| .github/workflows/build.yml | Runs api-proxy unit tests in CI. |
| .gitignore | Ignores design-docs/ working drafts. |

Comment on lines 89 to 110
test('should include X-RateLimit headers in responses', async () => {
  // Make a single request and check for rate limit headers
  const result = await runner.runWithSudo(
    `bash -c "curl -s -i -X POST http://${API_PROXY_IP}:10001/v1/messages -H 'Content-Type: application/json' -d '{\"model\":\"test\"}'"`,
    {
      allowDomains: ['api.anthropic.com'],
      enableApiProxy: true,
      buildLocal: true,
      logLevel: 'debug',
      timeout: 120000,
      env: {
        ANTHROPIC_API_KEY: 'sk-ant-fake-test-key-12345',
      },
    }
  );

  expect(result).toSucceed();
  // Even non-429 responses from rate-limited requests should have rate limit headers.
  // When rate limit IS triggered (429), headers are always present.
  // For a single request at default limits, we might get the upstream response
  // which won't have these headers. So use a low RPM and make 2 requests.
}, 180000);
Copilot AI Feb 25, 2026

This test doesn’t currently assert anything about the presence of X-RateLimit headers, so it can pass even if the feature regresses.

Either add assertions (e.g., force a 429 or otherwise ensure the proxy is the responder and then check for x-ratelimit-* headers), or remove/rename the test to reflect what it actually validates.

Mossaka (Collaborator, Author)

Removed the incomplete test. The subsequent test at line 112 already covers X-RateLimit header assertions properly by triggering a 429 with a low RPM.

Comment on lines 1014 to 1055
// Build rate limit config when API proxy is enabled
if (config.enableApiProxy) {
  // --no-rate-limit flag: commander stores as `options.rateLimit === false`
  const rateLimitDisabled = options.rateLimit === false;

  const rateLimitConfig: RateLimitConfig = {
    enabled: !rateLimitDisabled,
    rpm: 60,
    rph: 1000,
    bytesPm: 52428800,
  };

  if (!rateLimitDisabled) {
    if (options.rateLimitRpm !== undefined) {
      const rpm = parseInt(options.rateLimitRpm, 10);
      if (isNaN(rpm) || rpm <= 0) {
        logger.error('❌ --rate-limit-rpm must be a positive integer');
        process.exit(1);
      }
      rateLimitConfig.rpm = rpm;
    }
    if (options.rateLimitRph !== undefined) {
      const rph = parseInt(options.rateLimitRph, 10);
      if (isNaN(rph) || rph <= 0) {
        logger.error('❌ --rate-limit-rph must be a positive integer');
        process.exit(1);
      }
      rateLimitConfig.rph = rph;
    }
    if (options.rateLimitBytesPm !== undefined) {
      const bytesPm = parseInt(options.rateLimitBytesPm, 10);
      if (isNaN(bytesPm) || bytesPm <= 0) {
        logger.error('❌ --rate-limit-bytes-pm must be a positive integer');
        process.exit(1);
      }
      rateLimitConfig.bytesPm = bytesPm;
    }
  }

  config.rateLimitConfig = rateLimitConfig;
  logger.debug(`Rate limiting: enabled=${rateLimitConfig.enabled}, rpm=${rateLimitConfig.rpm}, rph=${rateLimitConfig.rph}, bytesPm=${rateLimitConfig.bytesPm}`);
}
Copilot AI Feb 25, 2026

The rate-limit flags are documented as requiring --enable-api-proxy, but if a user supplies --rate-limit-* or --no-rate-limit without --enable-api-proxy, they’re silently ignored because rateLimitConfig is only built when config.enableApiProxy is true.

For consistency with other dependent flags (e.g., --allow-urls requiring --ssl-bump), consider validating this combination and exiting with a clear error when rate-limit flags are used without enabling the API proxy.

Mossaka (Collaborator, Author)

Added validation: rate-limit flags without --enable-api-proxy now produce a clear error and exit(1).
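
The dependency check described in this reply could be sketched roughly as follows. The function name, return shape, and error wording are hypothetical. The option names follow the commander-style properties in the snippet under review. Returning a result object instead of calling process.exit keeps the check unit-testable.

```javascript
// Hypothetical validator: reject rate-limit flags when the API proxy is off.
function validateRateLimitFlags(enableApiProxy, options) {
  const used = [];
  if (options.rateLimitRpm !== undefined) used.push('--rate-limit-rpm');
  if (options.rateLimitRph !== undefined) used.push('--rate-limit-rph');
  if (options.rateLimitBytesPm !== undefined) used.push('--rate-limit-bytes-pm');
  if (options.rateLimit === false) used.push('--no-rate-limit');

  if (!enableApiProxy && used.length > 0) {
    return { valid: false, error: `${used.join(', ')} requires --enable-api-proxy` };
  }
  return { valid: true };
}

console.log(validateRateLimitFlags(false, { rateLimitRpm: '10' }));
// { valid: false, error: '--rate-limit-rpm requires --enable-api-proxy' }
console.log(validateRateLimitFlags(true, { rateLimitRpm: '10' }));
// { valid: true }
```

The caller would then log the error and exit with status 1, as the reply describes.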

Comment on lines 359 to 360
if (checkRateLimit(req, res, 'openai', 0)) return;

Copilot AI Feb 25, 2026

checkRateLimit is always being called with requestBytes=0, and the actual body size computed in proxyRequest is never used to enforce bytesPm. As a result, the "bytes per minute" limit is effectively a no-op, and any byte-based protection described by the flags/env vars won’t work.

Consider moving the rate-limit check inside proxyRequest after the request body is read (so you can pass requestBytes), or otherwise ensure limiter.check(provider, requestBytes) runs with the real body length before forwarding upstream.

Suggested change
- if (checkRateLimit(req, res, 'openai', 0)) return;
+ const contentLengthHeader = req.headers['content-length'];
+ const requestBytes = contentLengthHeader ? Number(contentLengthHeader) || 0 : 0;
+ if (checkRateLimit(req, res, 'openai', requestBytes)) return;

Mossaka (Collaborator, Author)

Fixed. All three provider handlers now read Content-Length from the request header and pass it to checkRateLimit(). The bytes-per-minute limit is now enforced.
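
Reading the declared body size from the header might look like the sketch below. `requestBytesFromHeaders` is a hypothetical name, and this version is slightly stricter than the suggested change because it also rejects negative and fractional values. Content-Length is supplied by the client, so the result is only an estimate of the real body size.

```javascript
// Hypothetical helper: derive the byte count for the bytes-per-minute check
// from the Content-Length header, falling back to 0 when it is unusable.
function requestBytesFromHeaders(headers) {
  const raw = headers['content-length'];
  if (raw === undefined) return 0; // e.g. chunked transfer: no declared length
  const n = Number(raw);
  return Number.isInteger(n) && n >= 0 ? n : 0;
}

console.log(requestBytesFromHeaders({ 'content-length': '2048' })); // 2048
console.log(requestBytesFromHeaders({ 'content-length': 'abc' }));  // 0
console.log(requestBytesFromHeaders({}));                           // 0
```

A request that omits or understates Content-Length is counted as smaller than it is, which is consistent with the known bytes-per-minute gaps a later commit in this PR documents.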

Comment on lines 72 to 95
function checkRateLimit(req, res, provider, requestBytes) {
  const check = limiter.check(provider, requestBytes);
  if (!check.allowed) {
    const requestId = req.headers['x-request-id'] || generateRequestId();
    const limitLabels = { rpm: 'requests per minute', rph: 'requests per hour', bytes_pm: 'bytes per minute' };
    const windowLabel = limitLabels[check.limitType] || check.limitType;

    metrics.increment('rate_limit_rejected_total', { provider, limit_type: check.limitType });
    logRequest('warn', 'rate_limited', {
      request_id: requestId,
      provider,
      limit_type: check.limitType,
      limit: check.limit,
      retry_after: check.retryAfter,
    });

    res.writeHead(429, {
      'Content-Type': 'application/json',
      'Retry-After': String(check.retryAfter),
      'X-RateLimit-Limit': String(check.limit),
      'X-RateLimit-Remaining': String(check.remaining),
      'X-RateLimit-Reset': String(check.resetAt),
      'X-Request-ID': requestId,
    });
Copilot AI Feb 25, 2026

x-request-id from the client is reflected directly into logs and response headers without validation/sanitization. This allows untrusted input to control log fields and a response header value.

Recommend validating the header (e.g., enforce a max length and a safe character set / UUID format) and falling back to a generated ID when invalid, before using it in logRequest or X-Request-ID.

Mossaka (Collaborator, Author)

Added isValidRequestId() helper that validates format (alphanumeric + dashes/dots, max 128 chars) and falls back to generated UUID for invalid input. Applied in both proxyRequest() and checkRateLimit().
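
A validator matching this description (alphanumerics, dashes, dots, at most 128 characters) could be sketched as below. The exact regular expression and the `resolveRequestId` helper are assumptions; only the name `isValidRequestId` and the rules come from the reply above.

```javascript
const { randomUUID } = require('node:crypto');

// Assumed pattern: 1 to 128 characters from [A-Za-z0-9.-].
const REQUEST_ID_PATTERN = /^[A-Za-z0-9.-]{1,128}$/;

function isValidRequestId(value) {
  return typeof value === 'string' && REQUEST_ID_PATTERN.test(value);
}

// Hypothetical helper: keep a well-formed client ID, otherwise generate one.
function resolveRequestId(headers) {
  const candidate = headers['x-request-id'];
  return isValidRequestId(candidate) ? candidate : randomUUID();
}

console.log(resolveRequestId({ 'x-request-id': 'trace-42.a' }));  // trace-42.a
console.log(isValidRequestId('<script>alert(1)</script>'));      // false
console.log(isValidRequestId('a'.repeat(129)));                   // false
```

Because the allowed character set excludes CR, LF, quotes, and angle brackets, a validated ID is safe to echo into both a response header and a JSON log line.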

Comment on lines 183 to 184
const resetAt = (nowSec + 1) + (MINUTE_SLOTS - 1);
const retryAfter = Math.max(1, MINUTE_SLOTS - (nowSec % MINUTE_SLOTS));
Copilot AI Feb 25, 2026

For RPM limit violations, resetAt is computed as nowSec + 60 regardless of the current offset within the minute. This can disagree with retryAfter (and with the reset calculation in getStatus), causing X-RateLimit-Reset to be later than necessary.

Compute resetAt consistently as nowSec + retryAfter (or align to the same minute-boundary logic used elsewhere).

Suggested change
- const resetAt = (nowSec + 1) + (MINUTE_SLOTS - 1);
- const retryAfter = Math.max(1, MINUTE_SLOTS - (nowSec % MINUTE_SLOTS));
+ const retryAfter = Math.max(1, MINUTE_SLOTS - (nowSec % MINUTE_SLOTS));
+ const resetAt = nowSec + retryAfter;

Mossaka (Collaborator, Author)

Fixed resetAt to use nowSec + retryAfter consistently.
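
A short worked example shows why the two values disagreed. It assumes MINUTE_SLOTS is 60, which matches the review comment's reading of the old formula as nowSec + 60. The function names are illustrative only.

```javascript
const MINUTE_SLOTS = 60; // assumed value, see lead-in

// Fixed version: resetAt is derived from retryAfter, so the two always agree.
function rpmRejectionTimes(nowSec) {
  const retryAfter = Math.max(1, MINUTE_SLOTS - (nowSec % MINUTE_SLOTS));
  return { retryAfter, resetAt: nowSec + retryAfter };
}

// Old formula from the review snippet, kept only for comparison.
function oldResetAt(nowSec) {
  return (nowSec + 1) + (MINUTE_SLOTS - 1);
}

// 25 seconds into a minute: 145 % 60 === 25.
console.log(rpmRejectionTimes(145)); // { retryAfter: 35, resetAt: 180 }
console.log(oldResetAt(145));        // 205, i.e. 25 seconds later than Retry-After implies
```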

Comment on lines 85 to 103
/**
 * Get the sliding window estimate of the current rate.
 *
 * Uses the formula: current_window_count + previous_window_weight * previous_total
 * where previous_window_weight = (slot_duration - elapsed_in_current_slot) / slot_duration
 *
 * This is a simplified but effective approach: we use the total across
 * all current-window slots plus a weighted fraction of the oldest expired slot's
 * contribution to approximate the true sliding window.
 *
 * @param {object} win - Window object
 * @param {number} now - Current time in the slot's unit
 * @param {number} size - Window size
 * @returns {number} Estimated count in the window
 */
function getWindowCount(win, now, size) {
  advanceWindow(win, now, size);
  return win.total;
}
Copilot AI Feb 25, 2026

The module-level comment and getWindowCount doc describe using a weighted portion of the previous window, but the implementation currently returns win.total after advancing/zeroing slots (no weighting).

Either update the documentation to match the implemented ring-buffer approach, or implement the weighted sliding-window behavior described here to avoid misleading future maintainers.

Mossaka (Collaborator, Author)

Updated JSDoc to accurately describe the ring-buffer approach (advance/zero stale slots, return sum of active counts).
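
The ring-buffer approach named in this reply (advance, zero stale slots, return the running total) can be sketched as follows. Only `advanceWindow`, `getWindowCount`, and `win.total` appear in the PR snippet. The slot layout, the `lastTime` field, and `record` are assumptions made for the example.

```javascript
function createWindow(size) {
  return { slots: new Array(size).fill(0), total: 0, lastTime: 0 };
}

// Zero every slot whose time unit has fallen out of the window since lastTime.
function advanceWindow(win, now, size) {
  const elapsed = now - win.lastTime;
  if (elapsed <= 0) return;
  if (elapsed >= size) {
    // Everything is stale: reset directly rather than subtracting slot by slot.
    win.slots.fill(0);
    win.total = 0;
  } else {
    for (let t = win.lastTime + 1; t <= now; t++) {
      const i = t % size;
      win.total -= win.slots[i];
      win.slots[i] = 0;
    }
  }
  win.lastTime = now;
}

function record(win, now, size, amount = 1) {
  advanceWindow(win, now, size);
  win.slots[now % size] += amount;
  win.total += amount;
}

function getWindowCount(win, now, size) {
  advanceWindow(win, now, size);
  return win.total;
}

const SIZE = 60; // 60 one-second slots, i.e. a one-minute window
const win = createWindow(SIZE);
record(win, 100, SIZE);
record(win, 100, SIZE);
record(win, 130, SIZE);
console.log(getWindowCount(win, 159, SIZE)); // 3: all requests still in window
console.log(getWindowCount(win, 160, SIZE)); // 1: the two from t=100 expired
console.log(getWindowCount(win, 190, SIZE)); // 0
```

There is no weighting of a previous window here: the count is exact at slot granularity, which is the behavior the review comment asked the documentation to describe.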

Comment on lines 316 to 322
function create() {
  const rpm = parseInt(process.env.AWF_RATE_LIMIT_RPM, 10) || DEFAULT_RPM;
  const rph = parseInt(process.env.AWF_RATE_LIMIT_RPH, 10) || DEFAULT_RPH;
  const bytesPm = parseInt(process.env.AWF_RATE_LIMIT_BYTES_PM, 10) || DEFAULT_BYTES_PM;
  const enabled = process.env.AWF_RATE_LIMIT_ENABLED !== 'false';

  return new RateLimiter({ rpm, rph, bytesPm, enabled });
Copilot AI Feb 25, 2026

create() uses parseInt(env, 10) || DEFAULT_*, which will accept negative values (e.g., "-1") and treat them as valid limits. That can lead to surprising behavior (e.g., immediate throttling or nonsensical remaining counts) if someone configures env vars directly.

Consider validating parsed env values are positive integers and falling back to defaults (or throwing) when invalid.

Mossaka (Collaborator, Author)

Fixed. create() now validates with Number.isFinite(raw) && raw > 0 before using parsed env values. Added 3 unit tests for negative, zero, and non-numeric env vars.
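
The validation described in this reply can be sketched with a small helper. `positiveIntFromEnv` is a hypothetical name, and the fallback of 600 is only an example value. The `Number.isFinite(raw) && raw > 0` test is the one quoted in the reply.

```javascript
// Hypothetical helper: parse an env var as a positive integer, else fall back.
function positiveIntFromEnv(env, name, fallback) {
  const raw = parseInt(env[name], 10);
  // NaN (missing or non-numeric), zero, and negatives all use the fallback.
  return Number.isFinite(raw) && raw > 0 ? raw : fallback;
}

const env = {
  GOOD: '120',
  NEGATIVE: '-1',
  ZERO: '0',
  TEXT: 'fast',
};

console.log(positiveIntFromEnv(env, 'GOOD', 600));     // 120
console.log(positiveIntFromEnv(env, 'NEGATIVE', 600)); // 600
console.log(positiveIntFromEnv(env, 'ZERO', 600));     // 600
console.log(positiveIntFromEnv(env, 'TEXT', 600));     // 600
console.log(positiveIntFromEnv(env, 'MISSING', 600));  // 600
```

Note that parseInt stops at the first non-digit, so a value such as "12abc" is still accepted as 12. A stricter check would be needed to reject it.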

@github-actions
Contributor

github-actions bot commented Feb 25, 2026

⚠️ Coverage Regression Detected

This PR decreases test coverage. Please add tests to maintain coverage levels.

Overall Coverage

| Metric | Base | PR | Delta |
| --- | --- | --- | --- |
| Lines | 82.39% | 82.20% | 📉 -0.19% |
| Statements | 82.32% | 82.16% | 📉 -0.16% |
| Functions | 82.74% | 82.82% | 📈 +0.08% |
| Branches | 74.55% | 74.34% | 📉 -0.21% |

📁 Per-file Coverage Changes (2 files)

| File | Lines (Before → After) | Statements (Before → After) |
| --- | --- | --- |
| src/docker-manager.ts | 83.6% → 84.1% (+0.56%) | 82.8% → 83.4% (+0.54%) |
| src/cli.ts | 43.8% → 44.9% (+1.15%) | 43.8% → 45.4% (+1.58%) |

Coverage comparison generated by scripts/ci/compare-coverage.ts

@github-actions
Contributor

🟢 Build Test: Node.js — PASS

Project Install Tests Status
clsx ✅ All passed PASS
execa ✅ All passed PASS
p-limit ✅ All passed PASS

Overall: ✅ PASS

Generated by Build Test Node.js for issue #1038

@github-actions
Contributor

Deno Build Test Results

Project Tests Status
oak 1/1 ✅ PASS
std 1/1 ✅ PASS

Overall: ✅ PASS

Deno version: 2.7.1

Generated by Build Test Deno for issue #1038

@github-actions
Contributor

Build Test: Bun Results ✅

Project Install Tests Status
elysia 1/1 PASS
hono 1/1 PASS

Overall: PASS

Bun version: 1.3.9

Generated by Build Test Bun for issue #1038

@github-actions
Contributor

.NET Build Test Results

Project Restore Build Run Status
hello-world PASS
json-parse PASS

Overall: PASS

Run outputs

hello-world: Hello, World!

json-parse:

{
  "Name": "AWF Test",
  "Version": 1,
  "Success": true
}
Name: AWF Test, Success: True

Generated by Build Test .NET for issue #1038

@github-actions
Contributor

🤖 Smoke test results for run 22413712321 @Mossaka

| Test | Result |
| --- | --- |
| GitHub MCP (last 2 merged PRs) | #1036 docs: add integration test coverage guide with gap analysis, #1035 feat: group --help flags by category, hide dev-only options |
| Playwright (github.com title) | ✅ Title contains "GitHub" |
| File write | /tmp/gh-aw/agent/smoke-test-copilot-22413712321.txt created |
| Bash (cat verify) | ✅ File content confirmed |

Overall: PASS

📰 BREAKING: Report filed by Smoke Copilot for issue #1038

@github-actions
Contributor

Go Build Test Results 🟢

Project Download Tests Status
color 1/1 PASS
env 1/1 PASS
uuid 1/1 PASS

Overall: PASS

Generated by Build Test Go for issue #1038

@github-actions
Contributor

Smoke Test Results

✅ GitHub MCP: #1036 "docs: add integration test coverage guide with gap analysis", #1035 "feat: group --help flags by category, hide dev-only options"
✅ Playwright: GitHub page title confirmed contains "GitHub"
✅ File Write: /tmp/gh-aw/agent/smoke-test-claude-22413712283.txt created
✅ Bash: File content verified

Overall: PASS

💥 [THE END] — Illustrated by Smoke Claude for issue #1038

@github-actions
Contributor

Rust Build Test Results

Project Build Tests Status
fd 1/1 PASS
zoxide 1/1 PASS

Overall: PASS

Generated by Build Test Rust for issue #1038

@github-actions
Contributor

PR titles: test: fix docker-warning tests and fragile timing dependencies | test: add CI workflow for non-chroot integration tests
1 GitHub MCP ✅; 2 safeinputs-gh ✅; 3 Playwright ✅; 4 Tavily ❌
5 File write ✅; 6 Bash cat ✅; 7 Discussion ✅; 8 Build ✅
Overall: FAIL

🔮 The oracle has spoken through Smoke Codex for issue #1038

@github-actions
Contributor

Java Build Test Results

Project Compile Tests Status
gson 1/1 PASS
caffeine 1/1 PASS

Overall: PASS

Generated by Build Test Java for issue #1038

@github-actions
Contributor

C++ Build Test Results

Project CMake Build Status
fmt PASS
json PASS

Overall: PASS

Generated by Build Test C++ for issue #1038

@github-actions
Contributor

Chroot Version Comparison Results

| Runtime | Host Version | Chroot Version | Match? |
| --- | --- | --- | --- |
| Python | Python 3.12.12 | Python 3.12.3 | No |
| Node.js | v24.13.1 | v20.20.0 | No |
| Go | go1.22.12 | go1.22.12 | Yes |

Result: FAILED — Python and Node.js versions differ between host and chroot environments. Go versions match.

Tested by Smoke Chroot for issue #1038

@github-actions
Contributor

Deno Build Test Results

Project Tests Status
oak 1/1 ✅ PASS
std 1/1 ✅ PASS

Overall: ✅ PASS

Generated by Build Test Deno for issue #1038

@github-actions
Contributor

Smoke Test Results

Overall: PASS

💥 [THE END] — Illustrated by Smoke Claude for issue #1038

Mossaka and others added 10 commits February 25, 2026 22:52
Add two integration test files that verify the observability and rate
limiting features work end-to-end with actual Docker containers.

api-proxy-observability.test.ts:
- /metrics endpoint returns valid JSON with counters, histograms, gauges
- /health endpoint includes metrics_summary
- X-Request-ID header in proxy responses
- Metrics increment after API requests
- rate_limits appear in /health

api-proxy-rate-limit.test.ts:
- 429 response when RPM limit exceeded
- Retry-After header in 429 response
- X-RateLimit-* headers in 429 response
- --no-rate-limit flag disables limiting
- Custom RPM reflected in /health
- Rate limit metrics in /metrics after rejection

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Refactor rate limit validation into a standalone exported function
that can be tested independently. Adds 12 unit tests covering
defaults, --no-rate-limit, custom values, and validation errors.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Dockerfile only copied server.js, but server.js now requires
logging.js, metrics.js, and rate-limiter.js. Without these files
the container exits immediately on startup, causing all agentic
workflow CI jobs to fail with "container awf-api-proxy exited (0)".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extract rate-limit flag validation into a standalone function with
7 unit tests covering all flag combinations. This addresses the
coverage regression from adding validation code inside the action
handler that couldn't be reached by unit tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…miting

Covers 12 scenarios including basic observability, rate limiting,
per-provider independence, and corner cases (lying Content-Length,
X-Request-ID injection, chunked transfers, window rollover, concurrent
load). Documents known gaps in the bytes-per-minute enforcement.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three fixes from devil's advocate review:

1. **Gauge double-decrement bug**: Added `errored` guard flag in
   proxyRequest() to prevent req.on('error') followed by req.on('end')
   from double-decrementing the active_requests gauge.

2. **retryAfter misaligned with sliding window**: Replaced calendar-
   aligned calculation (MINUTE_SLOTS - nowSec % MINUTE_SLOTS) with
   estimateRetryAfter() that scans the window to find when enough
   capacity will be freed. Also fixed getStatus() reset values.

3. **RPM default raised from 60 to 180**: 60 RPM (1 req/sec) is too
   restrictive for agents doing rapid tool calls. 180 RPM (3 req/sec)
   is comfortable for normal agent workflows while still catching
   runaway loops.

Bonus: Fixed total drift in advanceWindow() — on full-window clear
(elapsed >= size), set total=0 directly instead of subtracting.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
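
The retryAfter fix in the commit above replaces a calendar-aligned guess with a scan of the window. One way such a scan could work is sketched below. The signature and the slot layout (one counter per second, indexed by time modulo window size, with stale slots already zeroed) are assumptions and not the PR's actual estimateRetryAfter().

```javascript
// Hypothetical sketch: how many seconds until enough old slots expire that
// one more request fits under `limit`? Assumes stale slots are already zeroed.
function estimateRetryAfter(slots, total, now, limit) {
  const size = slots.length;
  let freed = 0;
  // Walk from the oldest live slot (time now - size + 1) to the newest (now).
  for (let age = size - 1; age >= 0; age--) {
    const t = now - age;
    freed += slots[((t % size) + size) % size];
    // The slot for time t leaves the window (size - age) seconds from now.
    if (total - freed < limit) return size - age;
  }
  return size; // the whole window has to drain
}

// Two requests at t=100 and one at t=130, limit 3, rejected at t=130.
const slots = new Array(60).fill(0);
slots[100 % 60] = 2;
slots[130 % 60] = 1;
console.log(estimateRetryAfter(slots, 3, 130, 3)); // 30
```

In this example the calendar-aligned formula quoted in the commit message would give 60 - (130 % 60) = 50 seconds, although capacity frees up after 30 seconds, when the t=100 slot leaves the window.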
Rate limiting is now disabled by default. Users opt in by providing
any --rate-limit-* flag:

  awf --enable-api-proxy --rate-limit-rpm 600 ...

When no rate-limit flags are provided, there are no request limits.
When any flag is provided, defaults for unset limits are generous
(600 RPM, 10000 RPH, 50MB/min bytes).

--no-rate-limit remains available to explicitly disable when other
flags might be set.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
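
The opt-in behavior described in the commit above could be resolved from parsed CLI options roughly like this. The function name is hypothetical and input validation is left out for brevity. The defaults (600 RPM, 10000 RPH, 50MB/min) and the commander-style option names come from this page.

```javascript
// Defaults applied to unset limits once any --rate-limit-* flag is given.
const DEFAULTS = { rpm: 600, rph: 10000, bytesPm: 50 * 1024 * 1024 };

// Hypothetical resolver: enabled only if a limit flag is present and
// --no-rate-limit (options.rateLimit === false) was not passed.
function resolveRateLimitConfig(options) {
  const anyLimitFlag = ['rateLimitRpm', 'rateLimitRph', 'rateLimitBytesPm']
    .some((key) => options[key] !== undefined);
  const explicitlyDisabled = options.rateLimit === false;

  const config = { enabled: anyLimitFlag && !explicitlyDisabled, ...DEFAULTS };
  if (options.rateLimitRpm !== undefined) config.rpm = Number(options.rateLimitRpm);
  if (options.rateLimitRph !== undefined) config.rph = Number(options.rateLimitRph);
  if (options.rateLimitBytesPm !== undefined) config.bytesPm = Number(options.rateLimitBytesPm);
  return config;
}

console.log(resolveRateLimitConfig({}).enabled);                       // false
console.log(resolveRateLimitConfig({ rateLimitRpm: '120' }));
// { enabled: true, rpm: 120, rph: 10000, bytesPm: 52428800 }
console.log(resolveRateLimitConfig({ rateLimitRpm: '120', rateLimit: false }).enabled); // false
```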
New observability tests:
- Custom X-Request-ID is preserved when valid
- Invalid X-Request-ID (<script> tags) is rejected, UUID generated
- active_requests gauge returns to 0 after request completes
- Latency histogram has entries with count > 0 after requests

New rate limiting tests:
- Default behavior: no 429s without any --rate-limit-* flags (opt-in)

Also fixed existing test: rate_limits in /health now requires
--rate-limit-rpm flag since rate limiting is opt-in.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
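
The active_requests test above only holds if the gauge is decremented exactly once per request, which is what the earlier double-decrement fix guards. A minimal sketch of such a guard follows. The PR uses an `errored` flag inside proxyRequest(); the helper and the `settled` flag here are hypothetical stand-ins.

```javascript
const { EventEmitter } = require('node:events');

// Hypothetical helper: count a request as active until its first terminal
// event, and ignore any terminal event that follows.
function trackActiveRequest(req, gauge) {
  gauge.value += 1;
  let settled = false;
  const finish = () => {
    if (settled) return; // 'error' followed by 'end' must not decrement twice
    settled = true;
    gauge.value -= 1;
  };
  req.on('error', finish);
  req.on('end', finish);
}

// Simulate a request that errors and then still emits 'end'.
const gauge = { value: 0 };
const req = new EventEmitter();
trackActiveRequest(req, gauge);
req.emit('error', new Error('socket hang up'));
req.emit('end');
console.log(gauge.value); // 0, not -1
```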
@Mossaka Mossaka force-pushed the feat/api-proxy-observability-ratelimit branch from 6f1898f to 5080873 Compare February 25, 2026 22:52
@github-actions
Contributor

⚠️ Coverage Regression Detected

This PR decreases test coverage. Please add tests to maintain coverage levels.

Overall Coverage

| Metric | Base | PR | Delta |
| --- | --- | --- | --- |
| Lines | 82.29% | 82.22% | 📉 -0.07% |
| Statements | 82.22% | 82.18% | 📉 -0.04% |
| Functions | 82.74% | 82.91% | 📈 +0.17% |
| Branches | 74.46% | 74.82% | 📈 +0.36% |

📁 Per-file Coverage Changes (2 files)

| File | Lines (Before → After) | Statements (Before → After) |
| --- | --- | --- |
| src/docker-manager.ts | 83.4% → 83.9% (+0.56%) | 82.6% → 83.2% (+0.55%) |
| src/cli.ts | 43.8% → 46.3% (+2.57%) | 43.8% → 46.8% (+2.97%) |

Coverage comparison generated by scripts/ci/compare-coverage.ts

@github-actions
Contributor

🧪 Build Test: Bun Results

Project Install Tests Status
elysia 1/1 PASS
hono 1/1 PASS

Overall: ✅ PASS

Bun version: 1.3.9

Generated by Build Test Bun for issue #1038

@github-actions
Contributor

C++ Build Test Results

Project CMake Build Status
fmt PASS
json PASS

Overall: PASS

Generated by Build Test C++ for issue #1038

@github-actions
Contributor

Smoke Test Results — run 22419520869

✅ GitHub MCP — Last 2 merged PRs: #1056 "refactor: remove --allow-full-filesystem-access flag", #1055 "feat: add API proxy port 10004 for OpenCode engine" (both by @Mossaka)
✅ Playwright — github.com title contains "GitHub"
✅ File write — /tmp/gh-aw/agent/smoke-test-copilot-22419520869.txt created
✅ Bash — file content verified

Overall: PASS | PR author: @Mossaka | No assignees

📰 BREAKING: Report filed by Smoke Copilot for issue #1038

@github-actions
Contributor

Go Build Test Results

Project Download Tests Status
color PASS PASS
env PASS PASS
uuid PASS PASS

Overall: ✅ PASS

Generated by Build Test Go for issue #1038

@github-actions
Contributor

Node.js Build Test Results

Project Install Tests Status
clsx PASS PASS
execa PASS PASS
p-limit PASS PASS

Overall: PASS

Generated by Build Test Node.js for issue #1038

@github-actions
Contributor

🦀 Rust Build Test Results

Project Build Tests Status
fd 1/1 PASS
zoxide 1/1 PASS

Overall: ✅ PASS

Generated by Build Test Rust for issue #1038

@github-actions
Contributor

Smoke Test Results

Overall: PASS

💥 [THE END] — Illustrated by Smoke Claude for issue #1038

@github-actions
Contributor

Deno Build Test Results

Project Tests Status
oak 1/1 ✅ PASS
std 1/1 ✅ PASS

Overall: ✅ PASS

All Deno tests passed successfully (Deno 2.7.1).

Generated by Build Test Deno for issue #1038

@github-actions
Contributor

Java Build Test Results ☕

Project Compile Tests Status
gson 1/1 PASS
caffeine 1/1 PASS

Overall: ✅ PASS

All Java projects compiled and tested successfully via Maven with the AWF proxy (172.30.0.10:3128).

Generated by Build Test Java for issue #1038

@github-actions
Contributor

.NET Build Test Results

Project Restore Build Run Status
hello-world PASS
json-parse PASS

Overall: PASS

Run output

hello-world: Hello, World!

json-parse:

{
  "Name": "AWF Test",
  "Version": 1,
  "Success": true
}
Name: AWF Test, Success: True

Generated by Build Test .NET for issue #1038

@github-actions
Contributor

PR titles:
refactor: remove --allow-full-filesystem-access flag
feat: add API proxy port 10004 for OpenCode engine
Tests: GitHub MCP ✅; SafeInputs GH ✅; Playwright ✅; Tavily search ❌; File write ✅; Bash cat ✅; Discussion comment ✅; Build ✅
Overall: FAIL

🔮 The oracle has spoken through Smoke Codex for issue #1038

@github-actions
Contributor

Chroot Version Comparison Results

| Runtime | Host Version | Chroot Version | Match? |
| --- | --- | --- | --- |
| Python | Python 3.12.12 | Python 3.12.3 | ❌ NO |
| Node.js | v24.13.1 | v20.20.0 | ❌ NO |
| Go | go1.22.12 | go1.22.12 | ✅ YES |

Result: ⚠️ Not all versions match. Go matches, but Python and Node.js versions differ between host and chroot environment.

Tested by Smoke Chroot for issue #1038

@Mossaka Mossaka merged commit 654da56 into main Feb 25, 2026
91 of 92 checks passed
@Mossaka Mossaka deleted the feat/api-proxy-observability-ratelimit branch February 25, 2026 23:07
@github-actions
Contributor

Smoke Test Results (run 22422835031)

Overall: PASS

💥 [THE END] — Illustrated by Smoke Claude
