feat: add stream mode support #282
Conversation
✅ Deploy Preview for vllm-semantic-router ready!
👥 vLLM Semantic Team Notification: The following members have been identified for the changed files in this PR and have been automatically assigned.
@tao12345666333 can you review this? Thanks
@AkisAya can you sign the DCO?
Yes, please assign it to me; I will review this PR tomorrow.
Force-pushed from d85c966 to 79dcce1
Signed-off-by: akisaya <akikevinsama@gmail.com>
Force-pushed from 79dcce1 to 0e4f210
signed
Have you tested that when you set stream=true you get a streaming response?
Tested it and it's working great, thanks!
What type of PR is this?
fix: add a workaround to preserve the stream field in the user's raw request, because openai-go's openai.ChatCompletionNewParams doesn't have this field; see issue #209
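For context, here is a minimal sketch of the dropped-field problem: because openai.ChatCompletionNewParams has no stream field, the flag has to be read from the raw request body before deserialization. The code below is illustrative only (hasStreamFlag is a hypothetical helper, not the router's actual implementation).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// hasStreamFlag (illustrative name) reports whether the raw chat-completions
// request asked for a streaming (SSE) response via "stream": true.
func hasStreamFlag(rawBody []byte) bool {
	var probe struct {
		Stream bool `json:"stream"`
	}
	if err := json.Unmarshal(rawBody, &probe); err != nil {
		return false
	}
	return probe.Stream
}

func main() {
	raw := []byte(`{"model":"auto","stream":true,"messages":[{"role":"user","content":"hi"}]}`)
	fmt.Println(hasStreamFlag(raw)) // true
}
```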
What this PR does / why we need it:
Stream mode is not supported well: when a user uses an auto model, the stream field of the user's raw input is dropped, so a non-SSE response is returned. This PR reuses the ExpectStreamingResponse field introduced by https://github.com/vllm-project/semantic-router/pull/203/files (set when the request carries Accept: text/event-stream) to identify whether the request expects a streaming response, and finally returns a mutated request body to Envoy with the stream field.
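Roughly, the re-injection step could look like the sketch below. restoreStreamField and its expectStreaming parameter are hypothetical stand-ins for the router's ExpectStreamingResponse handling: when a streaming response is expected, "stream": true is added back into the mutated body before it is returned to Envoy.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// restoreStreamField adds "stream": true back into the mutated request body
// when the original request expected a streaming response, so the backend
// still receives an SSE request after the model field is rewritten.
func restoreStreamField(mutatedBody []byte, expectStreaming bool) ([]byte, error) {
	if !expectStreaming {
		return mutatedBody, nil
	}
	var req map[string]any
	if err := json.Unmarshal(mutatedBody, &req); err != nil {
		return nil, err
	}
	req["stream"] = true
	return json.Marshal(req)
}

func main() {
	mutated := []byte(`{"model":"qwen2.5-7b","messages":[{"role":"user","content":"hi"}]}`)
	out, err := restoreStreamField(mutated, true)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```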
Which issue(s) this PR fixes:
Fixes #209
Release Notes: No