docs(test): add Phase 7B.6 latency benchmark protocol #184

Merged
PetrAnto merged 1 commit into main from claude/review-model-sync-gVjuw
Feb 25, 2026

@PetrAnto
Owner

5 representative tasks testing each 7B optimization:

  • Task A: Simple chat → 7B.2 model routing (< 5s, fast model)
  • Task B: Multi-tool → 7B.1 speculative execution (< 20s, 2 tools/1 iter)
  • Task C: GitHub read → 7B.3+7B.4 prefetch+injection (< 30s, ≤ 3 iter)
  • Task D: Orchestra → all optimizations end-to-end (< 3min, ≤ 15 iter)
  • Task E: Reasoning → 7B.5 streaming feedback (first update < 3s)

Includes pass/conditional/fail criteria and comparison notes.
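
As a rough illustration of how the budgets above could be checked mechanically, here is a minimal Python sketch. The latency and iteration limits are copied from the task list; everything else is an assumption made for illustration and is not taken from the benchmark protocol itself: the names `Budget`, `BUDGETS`, and `grade`, the reading of "2 tools/1 iter" as a one-iteration cap, and in particular the `CONDITIONAL_SLACK` rule, since this description does not say where the conditional band begins.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Budget:
    max_seconds: float                    # latency limit from the task list
    max_iterations: Optional[int] = None  # iteration cap, where one is given


# Limits copied from the PR description. For Task E the limit applies to the
# time until the first streaming update, not to total latency.
BUDGETS = {
    "A": Budget(5.0),
    "B": Budget(20.0, 1),    # ASSUMPTION: "2 tools/1 iter" read as a 1-iteration cap
    "C": Budget(30.0, 3),
    "D": Budget(180.0, 15),  # "< 3min"
    "E": Budget(3.0),
}

# ASSUMPTION: the PR does not define the conditional band. This sketch treats
# anything under 1.5x the latency budget as "conditional".
CONDITIONAL_SLACK = 1.5


def grade(task: str, seconds: float, iterations: Optional[int] = None) -> str:
    """Return 'pass', 'conditional', or 'fail' for one measured run."""
    budget = BUDGETS[task]
    # An exceeded iteration cap is treated as a hard failure in this sketch.
    if (
        budget.max_iterations is not None
        and iterations is not None
        and iterations > budget.max_iterations
    ):
        return "fail"
    if seconds < budget.max_seconds:
        return "pass"
    if seconds < budget.max_seconds * CONDITIONAL_SLACK:
        return "conditional"
    return "fail"
```

For example, `grade("C", 22.4, iterations=3)` falls inside both Task C limits, while the same latency with four iterations would be graded as a failure under the hard-cap assumption.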

https://claude.ai/code/session_01K2mQTABDGY7DnnposPdDjw

@PetrAnto PetrAnto merged commit 76a29ee into main Feb 25, 2026
0 of 5 checks passed
@github-actions

E2E Test Recording (base)

❌ Tests failed

@github-actions

E2E Test Recording (workers-ai)

❌ Tests failed

@github-actions

E2E Test Recording (discord)

❌ Tests failed

@github-actions

E2E Test Recording (telegram)

❌ Tests failed
