Add support for partial response messages #101

Draft: NeonDaniel wants to merge 5 commits into dev from FEAT_EnableMultipartResponses

Conversation

NeonDaniel (Member) commented Dec 5, 2024

Description

  • Adds support for multi-part responses in send_mq_request
  • Adds an optional callback parameter to handle partial responses (usage sketch below)
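
For illustration, a minimal usage sketch. The `stream_callback` parameter name, the import path, and the vhost/queue values are assumptions for this example, not confirmed API:

```python
from neon_mq_connector.utils.client_utils import send_mq_request


def on_partial(message: dict):
    # Hypothetical handler for each partial response. Per this PR's
    # design, every message is cumulative, so the latest partial
    # supersedes anything shown before it.
    print(message.get("response", ""))


# `stream_callback` is the optional parameter this PR proposes; the
# vhost and queue names below are placeholders.
final = send_mq_request(vhost="/llm",
                        request_data={"query": "Hello"},
                        target_queue="llm_input",
                        timeout=30,
                        stream_callback=on_partial)
# With or without a callback, the return value is the complete response
print(final)
```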

Issues

Other Notes

  • The API for responses is defined in Add MQResponse model neon-data-models#7
  • TBD if this should specify/enforce that the "final" message always contains the full response
  • The primary motivation here is supporting streaming LLM responses, but the implementation should be flexible for other applications

NeonDaniel force-pushed the FEAT_EnableMultipartResponses branch from 71d1685 to 162510a on December 11, 2024 at 21:26

Add note to address synchronization of responses
NeonDaniel added a commit to NeonGeckoCom/neon-data-models that referenced this pull request Dec 18, 2024
```python
# Opening call restored for context; the exchange argument is assumed
channel.basic_publish(exchange='',
                      routing_key=reply_channel,
                      body=dict_to_b64(response),
                      properties=pika.BasicProperties(expiration='1000'))
time.sleep(0.5)  # Used to ensure synchronous response handling
```
NeonDaniel (Member, Author) commented:
This is currently required because nothing accounts for responses arriving out of order

NeonBohdan commented Dec 18, 2024:

This is very bad: if RabbitMQ does not guarantee order of delivery, a sleep isn't a reliable solution.
To me it would be better to send the whole response every time, with small additions at the end.
Then, if messages don't arrive in order, text may briefly appear and disappear, but the final response will always be correct.

NeonDaniel (Member, Author) replied:

This is how I specified it in the documentation: each response should be cumulative rather than incremental (response 2 includes all of response 1 and more). The one with the _is_final parameter set to True will always be the complete response.

Each message does have an index, so we could enforce on the client side that the callback receives the messages in order (a sketch of that follows below).
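
For example, a buffering wrapper could flush messages to the callback strictly in `_part` order. A minimal sketch; only the `_part` field comes from this PR, the helper and its names are illustrative:

```python
def make_ordered_callback(callback):
    """Wrap `callback` so it receives messages in `_part` order,
    buffering any that arrive early. Assumes `_part` starts at 0."""
    buffered = {}          # _part index -> message
    state = {"next": 0}    # next index to deliver

    def on_message(message: dict):
        buffered[message["_part"]] = message
        # Flush every consecutive message we now hold
        while state["next"] in buffered:
            callback(buffered.pop(state["next"]))
            state["next"] += 1

    return on_message
```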

```python
    # Always return final result
    response_data.update(api_output)
else:
    response_data.update(api_output)
```

Does this mean that the only way to send objects is not chunk by chunk, but only as a solid object with partial additions to its content?

NeonDaniel (Member, Author) replied:

That is the intended use case as documented in the README changes.

I implemented the stream callback as optional, so send_mq_request is expected to always return a complete result at the end.

Chunked responses would also be a valid approach, but they would require specifying a standard method for combining them into a single complete response, which means carrying some additional information in each response (a hypothetical sketch follows this list), i.e.:

  • how to combine response data
  • how to present incremental responses to a consumer
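
As a hypothetical illustration of the first point, a chunked protocol might tag each partial with a combine hint. None of this is part of the current PR, which uses cumulative responses:

```python
def combine(accumulated: str, partial: dict) -> str:
    """Merge a chunked partial into the response accumulated so far.
    `_combine` is a hypothetical field, not part of this PR."""
    if partial.get("_combine") == "append":
        return accumulated + partial["response"]
    # "replace" behaves like the cumulative scheme this PR documents
    return partial["response"]
```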

```python
channel.close()
response_data.update(api_output)
response_event.set()
if isinstance(api_output.get('_part'), int):
```
A collaborator commented:
Maybe we could try to achieve this behaviour using Streams? https://www.rabbitmq.com/docs/streams
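
For reference, a RabbitMQ stream can be declared over AMQP 0-9-1 by setting the `x-queue-type` argument on a durable queue; a minimal pika sketch with placeholder connection parameters and queue name:

```python
import pika

# A stream is declared as a durable queue with the
# `x-queue-type: stream` argument (see the linked docs).
connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="partial_responses",   # placeholder name
                      durable=True,
                      arguments={"x-queue-type": "stream"})
```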

NeonDaniel (Member, Author) replied:

I looked into streams briefly, and based on the documented RMQ use cases they seemed more oriented toward sending larger quantities of data or broadcasting to multiple recipients.

I opted for this method because it appears to be simpler to implement with fewer changes to our client helpers.
