Fix chunk order issue in native field streaming #8892
Conversation
Pull Request Overview
This PR fixes a chunk order issue in native field streaming: when native response chunks are interleaved with string streaming, the buffering logic could reorder chunks by up to 10 positions.
- Modifies streaming listener to flush buffer when empty chunks with native fields are received
- Reorders listeners to process active buffering listeners first, ensuring correct chunk order
- Adds test coverage to verify the fix works correctly
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
File | Description
---|---
tests/streaming/test_streaming.py | Adds test assertions to verify chunk order is maintained correctly
dspy/streaming/streaming_listener.py | Implements buffer flushing for empty chunks and refactors token extraction
dspy/streaming/streamify.py | Reorders listeners to prioritize active buffering listeners
```python
# If we receive an empty chunk but streaming has started, flush the buffer.
# LiteLLM does not send completely empty chunks (https://github.com/BerriAI/litellm/blob/main/litellm/litellm_core_utils/model_response_utils.py#L10),
# so empty content means it has other native fields such as provider_specific_fields.
if not chunk_message:
```
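For context, here is a minimal, self-contained sketch of the flush-on-empty-chunk behavior under discussion. The names (`BufferingListener`, `receive`, `flush`, `stream_started`) are illustrative stand-ins, not dspy's actual internals:

```python
class BufferingListener:
    """Illustrative stand-in for a streaming listener that buffers tokens
    while scanning for an end token such as "[[ ## ... ## ]]"."""

    def __init__(self, end_token: str = "[[ ##"):
        self.end_token = end_token
        self.buffer: list[str] = []  # recent tokens held back while scanning
        self.stream_started = False

    def flush(self) -> str:
        """Release everything buffered; safe when the end token cannot be pending."""
        out = "".join(self.buffer)
        self.buffer.clear()
        return out

    def receive(self, chunk_message: str | None) -> str | None:
        # Empty content implies a native-field chunk (e.g. provider_specific_fields),
        # so no end-token text should be pending: flush to preserve chunk order.
        if not chunk_message:
            return self.flush() if self.stream_started else None
        self.stream_started = True
        self.buffer.append(chunk_message)
        # A real listener would scan the buffer for the end token here and
        # only emit tokens that provably cannot be part of it.
        return None
```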
---
Is this correct? From the log shared internally, the LM can produce citation chunks in between field chunks, so couldn't that lead to an unintended flush?
---
Flushing is for finding the end token, right? It is true that native chunks are passed between field chunks, but when the citation chunk is passed, the end token (like `[[ ##`) shouldn't be in the queue. So it's a safe time to flush the tokens for the ongoing string field.
---
I am not entirely sure about the LM streaming order, i.e. how it interleaves normal text with native features/events, but if it looks like:

1. text: `what`
2. text: `the`
3. citation:
4. text: `jesus?`
5. text: `[[ ##`
6. citation:
7. text: `completed ## ]]`

then at step 6, `[[ ##` will be yielded by the flush call.
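To make that concrete, here is a toy trace of the sequence above using the hypothetical `BufferingListener` sketched earlier (`None` stands in for a citation chunk):

```python
chunks = ["what ", "the ", None, "jesus? ", "[[ ##", None, " completed ## ]]"]

listener = BufferingListener()
for step, chunk in enumerate(chunks, start=1):
    emitted = listener.receive(chunk)
    if emitted:
        # Step 3 flushes "what the "; step 6 flushes "jesus? [[ ##",
        # yielding the partial end token early, which is the concern raised above.
        print(f"step {step}: flushed {emitted!r}")
```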
---
In that example, does it mean `completed ## ]]` is supported by the citation #5? I'm assuming that the LM produces the native response and the string response in the right order. If it makes mistakes, then yes, it won't work well.
---
The current streaming listener implementation has an issue: chunk order is modified (by at most 10 chunks) when a native response chunk is included in the string streaming, due to the buffering logic for the string output field. This PR fixes the issue by flushing the buffered chunks once when the native response chunk is received.
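For the streamify.py change, a minimal sketch of the reordering described in the review summary; the function and the `stream_started` attribute are assumptions for illustration, not dspy's actual API:

```python
def order_listeners(listeners):
    # Hypothetical: process listeners that are already buffering first, so a
    # listener holding back tokens can flush before idle listeners see the chunk.
    active = [listener for listener in listeners if listener.stream_started]
    idle = [listener for listener in listeners if not listener.stream_started]
    return active + idle
```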