feat(mock-server): add fallback tokenizer support #313
base: main
Conversation
Walkthrough

Adds a fallback tokenizer mechanism across the mock server: config introduces a `fallback_tokenizer` field and env var, the tokenizer service can set and use a fallback, app startup/config endpoints load and set it, and docs describe the selection order and updated failure conditions. Minor import warning suppression was added.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    actor Client
    participant App as Mock Server App
    participant TokSvc as TokenizerService
    participant HF as Transformers
    rect rgb(245,245,255)
        note over App,TokSvc: Startup initialization
        App->>TokSvc: load_tokenizers(server_config.tokenizer_models + fallback)
        App->>TokSvc: set_fallback_tokenizer(server_config.fallback_tokenizer)
        TokSvc->>HF: Import AutoTokenizer (warnings suppressed)
        TokSvc-->>App: Tokenizers loaded
    end
    rect rgb(245,255,245)
        note over Client,TokSvc: Request-time tokenizer selection
        Client->>App: Request with model X
        App->>TokSvc: get_tokenizer(X)
        alt Tokenizer X loaded
            TokSvc-->>App: tokenizer(X)
        else Fallback available
            TokSvc-->>App: tokenizer(fallback)
        else Neither available
            TokSvc-->>App: ValueError (not found: X and fallback)
            App-->>Client: 404
        end
        App-->>Client: Proceed with tokenization using selected tokenizer
    end
```
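In code terms, the request-time selection above boils down to something like the sketch below. This is an illustration only, not the project's code: the attribute names `_tokenizers` and `_fallback_tokenizer` follow the snippets quoted later in this review, but the class and exact signatures are assumptions.

```python
# Minimal sketch of the selection order shown in the diagram above.
# Not the project's implementation: treat names and signatures as assumptions.
from typing import Dict, Optional


class TokenizerSelector:
    def __init__(self, fallback: Optional[str] = None) -> None:
        self._tokenizers: Dict[str, object] = {}
        self._fallback_tokenizer = fallback

    def get_tokenizer(self, model_name: str) -> object:
        # 1. An exact match wins.
        if model_name in self._tokenizers:
            return self._tokenizers[model_name]
        # 2. Otherwise use the fallback, if one is configured and loaded.
        fallback = self._fallback_tokenizer
        if fallback and fallback in self._tokenizers:
            return self._tokenizers[fallback]
        # 3. Neither available: raise the error the app maps to a 404.
        raise ValueError(f"No tokenizer loaded for {model_name} or {fallback}")
```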
```mermaid
sequenceDiagram
    autonumber
    actor Client
    participant App as Mock Server App
    participant TokSvc as TokenizerService
    rect rgb(255,250,240)
        note over Client,App: Configure endpoint
        Client->>App: POST /configure { fallback_tokenizer?, tokenizer_models? }
        alt fallback_tokenizer provided
            App->>TokSvc: load_tokenizers([fallback_tokenizer])
            App->>TokSvc: set_fallback_tokenizer(fallback_tokenizer)
            TokSvc-->>App: ack
        else no fallback provided
            App-->>Client: ack (no fallback change)
        end
        App-->>Client: 200 Config applied
    end
```
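For reference, exercising this configure flow from a test script could look like the following sketch. Only the `/configure` path and the `fallback_tokenizer` field come from the diagram above; the host, port, and model id are placeholders.

```python
# Hypothetical client call against a locally running mock server.
import requests

resp = requests.post(
    "http://localhost:8000/configure",  # placeholder host/port
    json={"fallback_tokenizer": "gpt2"},  # placeholder model id
    timeout=30,
)
resp.raise_for_status()
print(resp.status_code)  # 200 once the config is applied
```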
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Pre-merge checks: ✅ 3 checks passed
Codecov Report: ✅ All modified and coverable lines are covered by tests.
Actionable comments posted: 2
🧹 Nitpick comments (1)
integration-tests/mock_server/app.py (1)
43-51: Avoid loading duplicate tokenizer entries

If the fallback tokenizer is already listed in `server_config.tokenizer_models`, this will load the same model twice. You can keep order while deduplicating by normalizing the list before passing it to the service:

```diff
-    tokenizer_models = [
-        *server_config.tokenizer_models,
-        server_config.fallback_tokenizer,
-    ]
+    tokenizer_models = list(
+        dict.fromkeys(
+            [*server_config.tokenizer_models, server_config.fallback_tokenizer]
+        )
+    )
```
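As a side note, `dict.fromkeys` works as an ordered dedup because dicts keep first-insertion order (guaranteed since Python 3.7):

```python
# dict.fromkeys keeps only the first occurrence of each key, in order.
models = ["model-a", "model-b", "model-a"]
print(list(dict.fromkeys(models)))  # ['model-a', 'model-b']
```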
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- integration-tests/README.md (9 hunks)
- integration-tests/mock_server/app.py (3 hunks)
- integration-tests/mock_server/config.py (1 hunks)
- integration-tests/mock_server/models.py (1 hunks)
- integration-tests/mock_server/tokenizer_service.py (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
integration-tests/mock_server/app.py (1)
integration-tests/mock_server/tokenizer_service.py (2)
- set_fallback_tokenizer (72-74)
- load_tokenizers (27-40)
🔇 Additional comments (3)
integration-tests/mock_server/config.py (1)
107-112: Fallback configuration wired cleanly

Field shape, description, and default tie in cleanly with the rest of the settings surface. Nice addition.
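For readers without the diff at hand, a settings field of this shape typically looks like the sketch below. This is an assumption, not the actual `config.py`; the env prefix in particular is a placeholder, since the review only confirms that a `fallback_tokenizer` field and an env var exist.

```python
# Hypothetical shape of the new setting; the real field lives in
# integration-tests/mock_server/config.py and may differ in env var name,
# default, and description.
from typing import Optional

from pydantic_settings import BaseSettings, SettingsConfigDict


class ServerConfig(BaseSettings):
    model_config = SettingsConfigDict(env_prefix="MOCK_SERVER_")  # placeholder prefix

    tokenizer_models: list[str] = []
    # Model id to fall back to when a requested tokenizer is not loaded.
    fallback_tokenizer: Optional[str] = None
```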
integration-tests/mock_server/models.py (1)
23-26: Model schema aligns with new config surface

Optional field makes sense here and keeps ConfigureMessage backwards compatible while exposing the new knob.
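A minimal sketch of such an optional field on the request model (hypothetical; the real `ConfigureMessage` in `models.py` may carry more fields):

```python
from typing import Optional

from pydantic import BaseModel


class ConfigureMessage(BaseModel):
    tokenizer_models: Optional[list[str]] = None
    # None means "leave the current fallback unchanged", which is what keeps
    # existing /configure clients backwards compatible.
    fallback_tokenizer: Optional[str] = None
```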
integration-tests/mock_server/app.py (1)
79-79: Good call adding the env propagation

Mirrors how the rest of the config is shared across workers; thanks for keeping that consistent.
```python
if request.fallback_tokenizer is not None:
    tokenizer_service.load_tokenizers([request.fallback_tokenizer])
    tokenizer_service.set_fallback_tokenizer(request.fallback_tokenizer)
    logger.info(f"Fallback tokenizer set to {request.fallback_tokenizer}")
```
Persist fallback choice back into server_config
We load and set the new fallback on the service, but `server_config.fallback_tokenizer` stays at its previous value. As a result, `/configure` responses and `/health` still report the old fallback, and any code that later reads `server_config.fallback_tokenizer` (including multi-worker env propagation or reload flows) will drift from the active tokenizer. Please assign it so the in-memory config stays truthful.
```diff
 if request.fallback_tokenizer is not None:
     tokenizer_service.load_tokenizers([request.fallback_tokenizer])
     tokenizer_service.set_fallback_tokenizer(request.fallback_tokenizer)
+    server_config.fallback_tokenizer = request.fallback_tokenizer
     logger.info(f"Fallback tokenizer set to {request.fallback_tokenizer}")
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
if request.fallback_tokenizer is not None:
    tokenizer_service.load_tokenizers([request.fallback_tokenizer])
    tokenizer_service.set_fallback_tokenizer(request.fallback_tokenizer)
    server_config.fallback_tokenizer = request.fallback_tokenizer
    logger.info(f"Fallback tokenizer set to {request.fallback_tokenizer}")
```
🤖 Prompt for AI Agents
In integration-tests/mock_server/app.py around lines 144 to 147, the code sets
the fallback on the tokenizer_service but doesn't update the in-memory
server_config; set server_config.fallback_tokenizer = request.fallback_tokenizer
after loading/setting the tokenizer so the server_config reflects the active
fallback (ensure server_config is in scope or import it if needed).
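A regression test for this could look roughly like the sketch below. It assumes a FastAPI app and that `/health` echoes the active fallback, per the review comment above; the import path and response field names are placeholders, not the project's actual schema.

```python
from fastapi.testclient import TestClient

from mock_server.app import app  # placeholder import path


def test_configure_persists_fallback():
    client = TestClient(app)
    resp = client.post("/configure", json={"fallback_tokenizer": "gpt2"})
    assert resp.status_code == 200
    # Without the fix, this would still report the previous fallback.
    health = client.get("/health").json()
    assert health["fallback_tokenizer"] == "gpt2"
```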
```diff
 if model_name not in self._tokenizers:
-    raise ValueError(f"No tokenizer loaded for {model_name}")
+    if self._fallback_tokenizer not in self._tokenizers:
+        raise ValueError(
+            f"No tokenizer loaded for {model_name} or {self._fallback_tokenizer}"
+        )
+    model_name = self._fallback_tokenizer
```
Restore lazy tokenizer loading before falling back.
We now short-circuit to the fallback whenever the requested model key is missing, which means we never even try to lazily load the requested tokenizer anymore. In the current server flows that rely on lazy loading (run without preloading, runtime configure updates, etc.), this silently swaps responses to the fallback tokenizer or just raises when no fallback is configured—a regression from today’s behavior. Please attempt to load the requested tokenizer first and only fall back when that load really fails, while also bootstrapping the fallback if it hasn’t been loaded yet.
```diff
-if model_name not in self._tokenizers:
-    if self._fallback_tokenizer not in self._tokenizers:
-        raise ValueError(
-            f"No tokenizer loaded for {model_name} or {self._fallback_tokenizer}"
-        )
-    model_name = self._fallback_tokenizer
+if model_name not in self._tokenizers:
+    try:
+        logger.info(f"Lazy-loading tokenizer for model: {model_name}")
+        self._tokenizers[model_name] = AutoTokenizer.from_pretrained(
+            model_name, trust_remote_code=True
+        )
+    except Exception as exc:
+        fallback = self._fallback_tokenizer
+        if not fallback:
+            raise ValueError(
+                f"No tokenizer loaded for {model_name}"
+            ) from exc
+        if fallback not in self._tokenizers:
+            logger.info(f"Lazy-loading fallback tokenizer: {fallback}")
+            self._tokenizers[fallback] = AutoTokenizer.from_pretrained(
+                fallback, trust_remote_code=True
+            )
+        model_name = fallback
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
if model_name not in self._tokenizers:
    try:
        logger.info(f"Lazy-loading tokenizer for model: {model_name}")
        self._tokenizers[model_name] = AutoTokenizer.from_pretrained(
            model_name, trust_remote_code=True
        )
    except Exception as exc:
        fallback = self._fallback_tokenizer
        if not fallback:
            raise ValueError(
                f"No tokenizer loaded for {model_name}"
            ) from exc
        if fallback not in self._tokenizers:
            logger.info(f"Lazy-loading fallback tokenizer: {fallback}")
            self._tokenizers[fallback] = AutoTokenizer.from_pretrained(
                fallback, trust_remote_code=True
            )
        model_name = fallback
```
Great job catching this! This feature will be helpful.
```python
logger.info("Tokenizer models loaded successfully")

if request.fallback_tokenizer is not None:
    tokenizer_service.load_tokenizers([request.fallback_tokenizer])
```
Is this always supposed to be called, even if the tokenizer specified exists?
Yeah, I need to add a check to see if it's already been loaded; CodeRabbit brought that up a few times.

This addresses a concern raised by a few users. Originally it was designed without a fallback on purpose, for error simulation, but that is no longer relevant. We can also provide failure scenarios in other ways in the future.
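For the already-loaded check mentioned above, one possible shape is a guard inside the loader. This is a sketch under assumed names, not the project's `load_tokenizers` (the cache is modeled as a plain dict here):

```python
from transformers import AutoTokenizer


def load_tokenizers(cache: dict, model_names: list[str]) -> None:
    """Load each tokenizer once; names already in the cache are skipped."""
    for name in model_names:
        if name in cache:
            continue  # already loaded, avoid a redundant download
        cache[name] = AutoTokenizer.from_pretrained(name, trust_remote_code=True)
```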