Description
When making two streaming requests at the same time with subscription auth, the second one waits about 60 seconds after the first one finishes before it even starts.
Reproduction
Concurrent requests (shows the problem):

```shell
# Fire off two streaming requests simultaneously
time curl -N -X POST http://localhost:3456/messages \
  -H 'Content-Type: application/json' \
  -d '{"model":"claude-sonnet-4","messages":[{"role":"user","content":"test"}],"stream":true}' &
time curl -N -X POST http://localhost:3456/messages \
  -H 'Content-Type: application/json' \
  -d '{"model":"claude-haiku-4","messages":[{"role":"user","content":"test"}],"stream":true}'

# Result:
# First request:  ~4 seconds
# Second request: ~63 seconds (waits 60s after the first completes)
```

Sequential requests (works fine):
```shell
# Run them one after another
time curl -N -X POST http://localhost:3456/messages \
  -H 'Content-Type: application/json' \
  -d '{"model":"claude-sonnet-4","messages":[{"role":"user","content":"test"}],"stream":true}'
time curl -N -X POST http://localhost:3456/messages \
  -H 'Content-Type: application/json' \
  -d '{"model":"claude-haiku-4","messages":[{"role":"user","content":"test"}],"stream":true}'

# Result:
# First request:  ~4 seconds
# Second request: ~5 seconds
# Total: ~9 seconds
```

Why this matters
OpenCode makes two concurrent requests on every message: one for the actual response and one that generates a session title with Haiku. With this project, that means every new message incurs the roughly 60-second wait.
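The impact is easy to quantify with a small timing harness: launch both jobs in the background and `wait` for them, then compare wall time against the sum of the individual runtimes. A sketch using `sleep` as a stand-in for the two requests (swap in the two curl commands from the repro):

```shell
#!/bin/sh
# Measure wall time for two jobs launched concurrently.
# With real concurrency, total ≈ the slower job;
# with this bug, total ≈ first + ~60s + second.
start=$(date +%s)
sleep 2 &   # stand-in for the main (sonnet) request
sleep 2 &   # stand-in for the title-generation (haiku) request
wait        # block until both background jobs finish
end=$(date +%s)
echo "total: $((end - start))s"
```

With the stand-ins the total comes out near 2 seconds rather than 4; run against this proxy, the curl version shows the ~60-second gap instead.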
Solution for OpenCode users
Set a different provider's model for title generation in your config:
```json
{
  "small_model": "zai-coding-plan/glm-4.7-flash"
}
```
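For context, `small_model` sits at the top level of your opencode.json next to the main `model` key. A sketch of a fuller config (the `$schema` URL, file layout, and the `model` value are my assumptions, not taken from this issue):

```json
{
  "$schema": "https://opencode.ai/config.json",
  "model": "anthropic/claude-sonnet-4",
  "small_model": "zai-coding-plan/glm-4.7-flash"
}
```

Keeping the main model on this provider while routing title generation to another provider avoids the concurrent second request entirely.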