Split communication logic from computation logic into orchestrator #3118
…ager # Conflicts: # python/sglang/srt/openai_api/adapter.py
# Conflicts: # python/sglang/srt/managers/detokenizer_manager.py
# Conflicts: # python/sglang/srt/entrypoints/engine.py # python/sglang/srt/managers/generation_manager.py # python/sglang/srt/managers/tokenizer_manager.py
# Conflicts: # python/sglang/srt/entrypoints/engine.py # python/sglang/srt/entrypoints/http_server.py # python/sglang/srt/managers/scheduler.py
LGTM except for one minor comment.
Others LGTM
…into feat/separate_comm
self.send_to_detokenizer.send_pyobj(
self.on_generation_output(
Shall we check self.on_generation_output before using it? E.g., add an error message if self.on_generation_output is None.
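A minimal sketch of the check suggested above, assuming the callback is an optional constructor argument. The class name `SchedulerSketch` and the method `emit` are hypothetical stand-ins for illustration and are not taken from the PR:

```python
from typing import Any, Callable, Optional


class SchedulerSketch:
    """Hypothetical stand-in for the scheduler; not the actual sglang class."""

    def __init__(
        self, on_generation_output: Optional[Callable[[Any], None]] = None
    ) -> None:
        # Assumption: the orchestrator injects this callback at construction time.
        self.on_generation_output = on_generation_output

    def emit(self, output: Any) -> None:
        # Fail early with a readable message instead of the generic
        # "'NoneType' object is not callable" TypeError.
        if self.on_generation_output is None:
            raise RuntimeError(
                "on_generation_output is not set; "
                "pass a callback before producing generation output"
            )
        self.on_generation_output(output)
```

With the guard in place, a missing callback surfaces as a descriptive RuntimeError at the call site.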
It seems the Scheduler class is an internal implementation detail and not a user-facing API, so I personally feel that either adding the check or leaving it out is OK.
After discussing with Lianmin, I now know there is no need to follow the implementation requirements in e.g. 2736. I therefore spent several hours writing a new PR, #3852, and this one is deprecated.
Motivation
Currently, the managers do both communication (e.g. "send to another process using mq") and computation (e.g. "do the detokenization"). This PR separates the two.
Moreover, this refactor makes adding SPMD logic a bit easier and cleaner.
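The separation described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's actual code: the class names (`SchedulerCore`, `DetokenizerCore`, `Orchestrator`) are hypothetical, the "generation" step is a toy, and an in-process `queue.Queue` stands in for the real inter-process message queue.

```python
import queue
from typing import Callable, Dict, List


class DetokenizerCore:
    """Computation only: maps token ids to text, knows nothing about transport."""

    def __init__(self, vocab: Dict[int, str]) -> None:
        self.vocab = vocab

    def detokenize(self, token_ids: List[int]) -> str:
        return "".join(self.vocab[t] for t in token_ids)


class SchedulerCore:
    """Computation only: reports results through a callback, not a socket."""

    def __init__(self, on_generation_output: Callable[[List[int]], None]) -> None:
        self.on_generation_output = on_generation_output

    def step(self, prompt_ids: List[int]) -> None:
        # Toy "generation" for illustration: reverse the prompt tokens.
        self.on_generation_output(list(reversed(prompt_ids)))


class Orchestrator:
    """Communication only: owns the queue and wires the components together."""

    def __init__(self, vocab: Dict[int, str]) -> None:
        # Assumption: a local queue replaces the cross-process message queue.
        self.to_detokenizer: "queue.Queue[List[int]]" = queue.Queue()
        self.scheduler = SchedulerCore(on_generation_output=self.to_detokenizer.put)
        self.detokenizer = DetokenizerCore(vocab)

    def run_once(self, prompt_ids: List[int]) -> str:
        self.scheduler.step(prompt_ids)
        return self.detokenizer.detokenize(self.to_detokenizer.get())
```

Because the compute classes only see a callback, swapping the transport (or running every component in one process, as an SPMD setup might) would only touch the orchestrator in this sketch.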
Modifications
Checklist