Add support for partial response messages #101

Draft: NeonDaniel wants to merge 5 commits into dev from FEAT_EnableMultipartResponses

Conversation

NeonDaniel (Member) commented Dec 5, 2024

Description

  • Adds support for multi-part responses in send_mq_request
  • Adds an optional callback parameter to handle partial responses (usage sketch below)
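
For illustration, a minimal usage sketch. The `stream_callback` parameter name, the import path, and the vhost/queue values are assumptions for this example, not confirmed API:

```python
from neon_mq_connector.utils.client_utils import send_mq_request


def on_partial(message: dict):
    # Hypothetical handler for each partial response. Per this PR's
    # design, every message is cumulative, so the latest partial
    # supersedes anything shown before it.
    print(message.get("response", ""))


# `stream_callback` is the optional parameter this PR proposes; the
# vhost and queue names below are placeholders.
final = send_mq_request(vhost="/llm",
                        request_data={"query": "Hello"},
                        target_queue="llm_input",
                        timeout=30,
                        stream_callback=on_partial)
# With or without a callback, the return value is the complete response
print(final)
```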

Issues

Other Notes

  • The API for responses is defined in Add MQResponse model neon-data-models#7
  • TBD if this should specify/enforce that the "final" message always contains the full response
  • The primary motivation here is supporting streaming LLM responses, but the implementation should be flexible for other applications

NeonDaniel force-pushed the FEAT_EnableMultipartResponses branch from 71d1685 to 162510a on December 11, 2024 at 21:26

Add note to address synchronization of responses
NeonDaniel added a commit to NeonGeckoCom/neon-data-models that referenced this pull request Dec 18, 2024
```python
# Opening call restored for context; the exchange argument is assumed
channel.basic_publish(exchange='',
                      routing_key=reply_channel,
                      body=dict_to_b64(response),
                      properties=pika.BasicProperties(expiration='1000'))
time.sleep(0.5)  # Used to ensure synchronous response handling
```
NeonDaniel (Member, Author) commented:
This is currently required because nothing accounts for responses arriving out of order

NeonBohdan commented Dec 18, 2024:

This is very bad: if RabbitMQ does not guarantee order of delivery, a sleep isn't a reliable solution.
To me it would be better to send the whole response every time, with small additions at the end.
Then, if messages don't arrive in order, text may briefly appear and disappear, but the final response will always be correct.

NeonDaniel (Member, Author) replied:

This is how I specified it in the documentation: each response should be cumulative rather than incremental (response 2 includes all of response 1 and more). The one with the _is_final parameter set to True will always be the complete response.

Each message does have an index, so we could enforce on the client side that the callback receives the messages in order (a sketch of that follows below).
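
For example, a buffering wrapper could flush messages to the callback strictly in `_part` order. A minimal sketch; only the `_part` field comes from this PR, the helper and its names are illustrative:

```python
def make_ordered_callback(callback):
    """Wrap `callback` so it receives messages in `_part` order,
    buffering any that arrive early. Assumes `_part` starts at 0."""
    buffered = {}          # _part index -> message
    state = {"next": 0}    # next index to deliver

    def on_message(message: dict):
        buffered[message["_part"]] = message
        # Flush every consecutive message we now hold
        while state["next"] in buffered:
            callback(buffered.pop(state["next"]))
            state["next"] += 1

    return on_message
```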

```python
    # Always return final result
    response_data.update(api_output)
else:
    response_data.update(api_output)
```

Does this mean that the only way to send objects is not chunk by chunk, but only as a solid object with partial additions to its content?

NeonDaniel (Member, Author) replied:

That is the intended use case as documented in the README changes.

I implemented the stream callback as optional, so send_mq_request is expected to always return a complete result at the end.

Chunked responses would also be a valid approach, but they would require specifying a standard method for combining them into a single complete response, which means carrying some additional information in each response (a hypothetical sketch follows this list), i.e.:

  • how to combine response data
  • how to present incremental responses to a consumer
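
As a hypothetical illustration of the first point, a chunked protocol might tag each partial with a combine hint. None of this is part of the current PR, which uses cumulative responses:

```python
def combine(accumulated: str, partial: dict) -> str:
    """Merge a chunked partial into the response accumulated so far.
    `_combine` is a hypothetical field, not part of this PR."""
    if partial.get("_combine") == "append":
        return accumulated + partial["response"]
    # "replace" behaves like the cumulative scheme this PR documents
    return partial["response"]
```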

```python
channel.close()
response_data.update(api_output)
response_event.set()
if isinstance(api_output.get('_part'), int):
```
A collaborator commented:
Maybe we could try to achieve this behaviour using Streams? https://www.rabbitmq.com/docs/streams
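
For reference, a RabbitMQ stream can be declared over AMQP 0-9-1 by setting the `x-queue-type` argument on a durable queue; a minimal pika sketch with placeholder connection parameters and queue name:

```python
import pika

# A stream is declared as a durable queue with the
# `x-queue-type: stream` argument (see the linked docs).
connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="partial_responses",   # placeholder name
                      durable=True,
                      arguments={"x-queue-type": "stream"})
```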

NeonDaniel (Member, Author) replied:

I looked into streams briefly, and based on the documented RMQ use cases they seemed more oriented toward sending larger quantities of data or broadcasting to multiple recipients.

I opted for this method because it appears to be simpler to implement with fewer changes to our client helpers.
