[EAGLE-5341] Remove model_proto caching from ModelRunner, ModelServicer, and server #838
Conversation
- Modified model_runner.py to merge the cached model_proto with request.model
- Modified model_servicer.py to merge the cached model_proto with request.model
- Changed from a conditional CopyFrom to an unconditional MergeFrom pattern
- Added comprehensive unit tests in test_model_proto_merge.py
- All tests passing, linting clean

Co-authored-by: wemoveon2 <86081107+wemoveon2@users.noreply.github.com>
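A minimal sketch of the unconditional MergeFrom pattern described above (the `resources_pb2.Model` import path is real, but the exact field names and helper shape are assumptions, not taken from the diff):

```python
from clarifai_grpc.grpc.api import resources_pb2

def merge_model_proto(cached_model_proto, request):
    """Overlay the request's model fields onto a copy of the cached proto."""
    merged = resources_pb2.Model()
    merged.CopyFrom(cached_model_proto)  # start from the full cached proto
    # MergeFrom overwrites singular fields that request.model sets and
    # leaves the cached values in place for fields the request omits.
    merged.MergeFrom(request.model)
    return merged
```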
@copilot why not something like `if self.model_proto is not None:`?
- Changed from always creating a temp model to explicit if-else branches
- When the request has no model: directly copy the cached proto (unchanged)
- When the request has a model: create merged_model, copy the cached proto, merge the request
- More readable code that separates the two logical paths
- All tests passing, no functional changes

Co-authored-by: wemoveon2 <86081107+wemoveon2@users.noreply.github.com>
Good suggestion! Updated to use the explicit if-else pattern for better readability. The two code paths are now clearly separated:
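A minimal sketch of the two branches, assuming `request.model` is a singular message field (so `HasField` applies); the exact condition in the diff is not shown here:

```python
if not request.HasField("model"):
    # Path 1: request carries no model, copy the cached proto unchanged.
    request.model.CopyFrom(self.model_proto)
else:
    # Path 2: request carries a model, overlay it onto a copy of the cache.
    merged_model = resources_pb2.Model()
    merged_model.CopyFrom(self.model_proto)
    merged_model.MergeFrom(request.model)
    request.model.CopyFrom(merged_model)
```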
Changes in commit ccea79e. All tests still passing.
@copilot describe the behaviors tested in the tests you added
The tests validate two key scenarios across different code paths:
- Scenario 1: an empty request (no model proto)
- Scenario 2: a request with a minimal model proto (the bug case)

Each test also verifies the operation succeeds (SUCCESS status code) and covers different API methods (predict, generate, stream).
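The deleted test file is not reproduced in this thread, but the two behaviors it checked can be sketched self-containedly against protobuf merge semantics (this sketch omits the SUCCESS-status and predict/generate/stream assertions the real tests also made):

```python
from clarifai_grpc.grpc.api import resources_pb2

def test_merge_preserves_cached_fields_and_applies_overrides():
    # Scenario 2 (the bug case): the request carries a minimal model proto.
    cached = resources_pb2.Model(id="cached-id", name="cached-name")
    request_model = resources_pb2.Model(id="request-id")

    merged = resources_pb2.Model()
    merged.CopyFrom(cached)
    merged.MergeFrom(request_model)

    assert merged.id == "request-id"     # request field overrides the cache
    assert merged.name == "cached-name"  # omitted field survives from cache

def test_empty_request_model_leaves_cache_untouched():
    # Scenario 1: an empty request-side model changes nothing.
    cached = resources_pb2.Model(id="cached-id")
    merged = resources_pb2.Model()
    merged.CopyFrom(cached)
    merged.MergeFrom(resources_pb2.Model())
    assert merged == cached
```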
@copilot drop passing of the model proto entirely. the request will contain a model proto if it needs it; nothing is needed on the sdk side. remove the usage of the cached model proto from both the model runner and the grpc servicer
- Removed model_proto parameter from ModelRunner.__init__
- Removed model_proto parameter from ModelServicer.__init__
- Removed all model proto merging/copying logic from both classes
- Removed model_proto parameter from the pmo_iterator function
- Removed resources_pb2 import from model_servicer.py (no longer needed)
- Deleted the test_model_proto_merge.py test file
- The request will contain the model proto directly when needed, per requirements

Co-authored-by: wemoveon2 <86081107+wemoveon2@users.noreply.github.com>
Removed all model_proto caching logic as requested. Changes are in commit 6730513.
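Roughly, the constructor change looks like this sketch (signatures assumed from the commit message, not the exact diff):

```python
# Before (sketch): the cached proto was threaded through construction.
class ModelRunner:
    def __init__(self, model, model_proto=None):
        self.model = model
        self.model_proto = model_proto  # cached, then merged into requests

# After (sketch): no SDK-side cache; the request carries the proto itself.
class ModelRunner:  # redefined here purely for illustration
    def __init__(self, model):
        self.model = model
```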
@copilot we need to stop passing the model proto in server.py
- Removed model_proto argument from ModelServicer initialization
- Removed model_proto argument from ModelRunner initialization
- The server no longer passes a cached model proto to the servicer or runner
- Aligns with the approach that requests contain the model proto directly

Co-authored-by: wemoveon2 <86081107+wemoveon2@users.noreply.github.com>
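A self-contained sketch of the resulting wiring in server.py (class bodies stubbed; the real signatures may differ):

```python
class ModelServicer:
    def __init__(self, model):  # model_proto argument removed
        self.model = model

class ModelRunner:
    def __init__(self, model):  # model_proto argument removed
        self.model = model

def build_server_components(model):
    # server.py now constructs both without passing any cached proto.
    return ModelServicer(model), ModelRunner(model)
```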
Why
The request already carries a model proto whenever one is needed, so caching it on the SDK side is redundant.
How
- Removed the `model_proto` parameter from `ModelRunner.__init__` and `ModelServicer.__init__`
- Removed the `self.model_proto` instance variable from both classes
- Removed the cached-proto handling from `ModelRunner`: `runner_item_predict()`, `runner_item_generate()`, `pmo_iterator()`
- Removed the cached-proto handling from `ModelServicer`: `PostModelOutputs()`, `GenerateModelOutputs()`, `StreamModelOutputs()`
- Removed the `resources_pb2` import from `model_servicer.py` (no longer needed)
- Removed the `model_proto` argument from the `ModelServicer` and `ModelRunner` initialization in `server.py`

Tests
- Deleted `tests/runners/test_model_proto_merge.py` (tested removed functionality)
Notes
- Callers that previously relied on the cached `model_proto` parameter must now include the model proto in the request itself.
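With the cache gone, a caller that needs a specific model proto attaches it to the request; a minimal sketch using `clarifai_grpc` message types (the surrounding request wiring is assumed):

```python
from clarifai_grpc.grpc.api import resources_pb2, service_pb2

def build_request(model_id: str) -> service_pb2.PostModelOutputsRequest:
    request = service_pb2.PostModelOutputsRequest()
    # The request itself carries the model proto when one is needed;
    # nothing is cached on the SDK side anymore.
    request.model.CopyFrom(resources_pb2.Model(id=model_id))
    return request
```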