Implement wire protocol version compatibility semantics #113
Open
conradbzura wants to merge 7 commits into main from protobuf-version-handshake
7 commits
0a5d224 build: Add version fields to proto schemas (conradbzura)
3d72c09 feat: Add version to task serialization and Nack handling (conradbzura)
55913c3 feat: Add discovery-time major version filter (conradbzura)
df13abd feat: Add gRPC version interceptor for dispatch (conradbzura)
6505a21 test: Add wire protocol version compatibility tests (conradbzura)
aa56dcf docs: Add protobuf subpackage README (conradbzura)
bc1c4b7 ci: Fetch tags in checkout step for version resolution (conradbzura)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,84 @@ | ||
| # Wire protocol | ||
|
|
||
| Wool uses a binary wire protocol built on Protocol Buffers and gRPC | ||
| for all communication between clients and workers. | ||
|
|
||
| ## Dispatch sequence | ||
|
|
||
| The `Worker.dispatch` RPC uses a server-streaming pattern. The client | ||
| sends a single `Task` message and receives a stream of `Response` | ||
| messages: | ||
|
|
||
| ``` | ||
| Client                     Worker | ||
| |                               | | ||
| |── Task ──────────────────────>| | ||
| |                               | | ||
| |<──────── Response(Ack) ───────| (or Nack on rejection) | ||
| |<──────── Response(Result) ────| (one or more results) | ||
| |<──────── Response(Exception) ─| (on failure) | ||
| |                               | | ||
| ``` | ||
|
Review comment (Contributor, Author): This could be a simple Mermaid sequence diagram.
||
|
|
||
| ### Response types | ||
|
|
||
| 1. **Ack** — The worker accepted the task and started processing. | ||
| Carries the worker's `version` string for observability. | ||
| 2. **Nack** — The worker rejected the task. The `reason` field | ||
| describes why (e.g., major version mismatch, unparseable version). | ||
| No further responses follow a Nack. | ||
| 3. **Result** — A cloudpickle-serialized return value. Coroutine | ||
| tasks yield exactly one result; async generator tasks yield one | ||
| per iteration. | ||
| 4. **Exception** — A cloudpickle-serialized exception from the | ||
| remote execution. Terminates the stream. | ||
|
|
||
| ## Serialization | ||
|
|
||
| Wool uses a hybrid serialization approach: | ||
|
|
||
| - **Protobuf envelope** — Structured metadata fields (`id`, | ||
| `version`, `caller`, `timeout`, etc.) are native protobuf fields | ||
| for efficient parsing and forward compatibility. | ||
| - **cloudpickle payloads** — The `callable`, `args`, `kwargs`, and | ||
| `proxy` fields are serialized with cloudpickle and stored as | ||
| `bytes` fields. This allows arbitrary Python objects to be | ||
| transmitted without schema changes. | ||
| - **Results and exceptions** — `Result.dump` and `Exception.dump` | ||
| are cloudpickle-serialized bytes. | ||
|
|
||
| ## Version compatibility | ||
|
|
||
| Wool enforces major-version compatibility at two layers. | ||
|
|
||
| ### Discovery-time filtering | ||
|
|
||
| `WorkerProxy` applies a version filter during worker discovery. | ||
| Workers whose major version differs from the client's are excluded | ||
| from the load balancer and never receive tasks. | ||
|
|
||
| ### Dispatch-time interception | ||
|
|
||
| `VersionInterceptor` is a gRPC server interceptor that extracts the | ||
| version field from raw request bytes *before* full deserialization. | ||
| This uses `TaskVersionEnvelope` — a minimal protobuf message | ||
| containing only `string version = 1` — which can parse field 1 from | ||
| any `Task` wire format, including future incompatible versions. | ||
|
|
||
| Requests with empty, missing, or unparseable version fields are | ||
| rejected with a `Nack` response. If the client's major version | ||
| differs from the worker's, the interceptor yields a `Nack` without | ||
| attempting full deserialization. This prevents deserialization errors | ||
| when the wire format has changed across major versions. | ||
|
|
||
| ## Schema evolution rules | ||
|
|
||
| - **Additive-only within a major version.** New fields may be | ||
| appended with new field numbers. Existing field numbers and types | ||
| must not change within the same major version. | ||
| - **Major version = wire compatibility boundary.** A major version | ||
| bump permits breaking changes to the protobuf schema (field | ||
| renumbering, type changes, removal). | ||
| - **Field 1 is always `version`.** The `Task` message reserves | ||
| field 1 for the version string. This invariant enables | ||
| pre-deserialization version extraction via `TaskVersionEnvelope`. | ||
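The field-1 invariant described in the README can be illustrated without any generated protobuf code. The sketch below is not Wool's implementation (the PR parses with `TaskVersionEnvelope`); it is a hand-rolled reader for the protobuf wire format, and the names `extract_version` and `_read_varint` are hypothetical. It shows why a parser that knows only field 1 can still recover the version from a message whose other fields it does not understand: every unknown field is skipped according to its wire type.

```python
from __future__ import annotations


def _read_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode a base-128 varint at ``pos``; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear marks the last byte
            return result, pos
        shift += 7
        if shift >= 64:
            raise ValueError("varint too long")


def extract_version(raw: bytes) -> str:
    """Return the string in field 1, or "" if the field is absent.

    Hypothetical helper: it only assumes that field 1 is a
    length-delimited UTF-8 string, as the README states for ``Task``.
    """
    pos = 0
    version = ""
    while pos < len(raw):
        key, pos = _read_varint(raw, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:  # varint: read and discard
            _, pos = _read_varint(raw, pos)
        elif wire_type == 1:  # fixed 64-bit
            pos += 8
        elif wire_type == 2:  # length-delimited (strings, bytes, messages)
            length, pos = _read_varint(raw, pos)
            end = pos + length
            if end > len(raw):
                raise ValueError("truncated length-delimited field")
            if field_number == 1:
                version = raw[pos:end].decode("utf-8")
            pos = end
        elif wire_type == 5:  # fixed 32-bit
            pos += 4
        else:  # deprecated group wire types are not handled in this sketch
            raise ValueError(f"unsupported wire type {wire_type}")
    if pos > len(raw):
        raise ValueError("truncated fixed-width field")
    return version


# Field 1 = "1.2.3", followed by two fields this reader knows nothing about.
raw = bytes([0x0A, 5]) + b"1.2.3" + bytes([0x10, 42]) + bytes([0x3A, 3]) + b"abc"
print(extract_version(raw))
```

Because the reader never interprets fields other than field 1, renumbering or retyping them in a later major version does not stop the version from being read, which is the property the schema evolution rules rely on.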
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,80 @@ | ||
| from __future__ import annotations | ||
|
|
||
| import grpc | ||
| import grpc.aio | ||
|
|
||
| import wool | ||
| from wool.runtime import protobuf as pb | ||
| from wool.runtime.worker.proxy import parse_major_version | ||
|
|
||
|
|
||
| class VersionInterceptor(grpc.aio.ServerInterceptor): | ||
|     """gRPC server interceptor for wire protocol version checking. | ||
|
|
||
|     Intercepts the ``dispatch`` RPC to extract the client version from | ||
|     field 1 of the raw request bytes using | ||
|     :class:`~wool.runtime.protobuf.task.TaskVersionEnvelope`. If the | ||
|     client major version differs from the local worker major version, | ||
|     the RPC is short-circuited with a | ||
|     :class:`~wool.runtime.protobuf.worker.Nack` response. | ||
|
|
||
|     Requests with empty, missing, or unparseable version fields are | ||
|     rejected. | ||
|     """ | ||
|
|
||
|     async def intercept_service(self, continuation, handler_call_details): | ||
|         handler = await continuation(handler_call_details) | ||
|         if handler is None or not handler_call_details.method.endswith("/dispatch"): | ||
|             return handler | ||
|
|
||
|         original_handler = handler.unary_stream | ||
|         original_deserializer = handler.request_deserializer | ||
|         assert original_handler is not None | ||
|         assert original_deserializer is not None | ||
|
|
||
|         async def version_checked_handler(request_bytes, context): | ||
|             envelope = pb.task.TaskVersionEnvelope() | ||
|             try: | ||
|                 envelope.ParseFromString(request_bytes) | ||
|             except Exception: | ||
|                 yield pb.worker.Response( | ||
|                     nack=pb.worker.Nack(reason="Failed to parse version envelope") | ||
|                 ) | ||
|                 return | ||
|
|
||
|             client_major = parse_major_version(envelope.version) | ||
|             local_major = parse_major_version(wool.__version__) | ||
|
|
||
|             if client_major is None or local_major is None: | ||
|                 yield pb.worker.Response( | ||
|                     nack=pb.worker.Nack( | ||
|                         reason=( | ||
|                             f"Unparseable version: " | ||
|                             f"client={envelope.version!r}, " | ||
|                             f"worker={wool.__version__!r}" | ||
|                         ) | ||
|                     ) | ||
|                 ) | ||
|                 return | ||
|
|
||
|             if client_major != local_major: | ||
|                 yield pb.worker.Response( | ||
|                     nack=pb.worker.Nack( | ||
|                         reason=( | ||
|                             f"Major version mismatch: " | ||
|                             f"client={envelope.version}, " | ||
|                             f"worker={wool.__version__}" | ||
|                         ) | ||
|                     ) | ||
|                 ) | ||
|                 return | ||
|
|
||
|             request = original_deserializer(request_bytes) | ||
|             async for response in original_handler(request, context):  # pyright: ignore[reportGeneralTypeIssues] | ||
|                 yield response | ||
|
|
||
|         return grpc.unary_stream_rpc_method_handler( | ||
|             version_checked_handler, | ||
|             request_deserializer=None, | ||
|             response_serializer=handler.response_serializer, | ||
|         ) |
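The accept/reject decision made inside `version_checked_handler` can be isolated as a pure function, which makes it easier to reason about and to unit test without a gRPC server. The sketch below rests on assumptions: the real `parse_major_version` is imported from `wool.runtime.worker.proxy` and is not shown in this diff, so the version here is a hypothetical stand-in that returns the leading integer of a dotted version string or `None`. `nack_reason` is likewise a hypothetical name; its reason strings mirror the ones in the interceptor above.

```python
from __future__ import annotations

import re

# Leading run of digits followed by a dot or the end of the string.
_MAJOR_RE = re.compile(r"(\d+)(?:\.|$)")


def parse_major_version(version: str) -> int | None:
    """Hypothetical stand-in: return the major component, or None."""
    match = _MAJOR_RE.match(version.strip())
    return int(match.group(1)) if match else None


def nack_reason(client_version: str, worker_version: str) -> str | None:
    """Return a Nack reason, or None when the request may proceed."""
    client_major = parse_major_version(client_version)
    worker_major = parse_major_version(worker_version)
    if client_major is None or worker_major is None:
        return (
            f"Unparseable version: "
            f"client={client_version!r}, "
            f"worker={worker_version!r}"
        )
    if client_major != worker_major:
        return (
            f"Major version mismatch: "
            f"client={client_version}, "
            f"worker={worker_version}"
        )
    return None


print(nack_reason("1.4.0", "1.9.2"))  # same major version: no Nack
print(nack_reason("2.0.0", "1.9.2"))  # differing major versions
print(nack_reason("", "1.9.2"))       # empty client version
```

Under the same assumptions, the discovery-time filter described in the README could reuse this check by excluding any worker for which `nack_reason` returns a non-`None` value.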