
Conversation

@ajcasagrande (Contributor) commented Sep 28, 2025

This addresses a concern raised by a few users. The mock server was originally designed without a fallback on purpose, to enable error simulation, but that is no longer relevant. We can provide failure scenarios in other ways in the future if needed.

coderabbitai bot commented Sep 28, 2025

Walkthrough

Adds a fallback tokenizer mechanism across the mock server: the config introduces a fallback_tokenizer field and a matching env var, the tokenizer service can set and use a fallback, app startup and the configure endpoint load and set it, and the docs describe the selection order and updated failure conditions. Minor import-warning suppression was also added.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Docs: Fallback tokenizer behavior and config**<br>`integration-tests/README.md` | Documents the fallback tokenizer mechanism, the selection order, the new env var `MOCK_SERVER_FALLBACK_TOKENIZER`, and the config field `fallback_tokenizer`; updates terminology (e.g., `max_completion_tokens`). |
| **Config: Server-level fallback setting**<br>`integration-tests/mock_server/config.py` | Adds `fallback_tokenizer: str` with default `"Qwen/Qwen3-0.6B"` to `MockServerConfig` (sketched below). |
| **Models: Configure message schema**<br>`integration-tests/mock_server/models.py` | Adds an optional `fallback_tokenizer` field to `ConfigureMessage`. |
| **Tokenizer service: Fallback logic and imports**<br>`integration-tests/mock_server/tokenizer_service.py` | Suppresses transformers import warnings; adds `_fallback_tokenizer` and `set_fallback_tokenizer(...)`; `get_tokenizer(...)` now falls back to the configured tokenizer and errors only if both the requested and fallback tokenizers are unavailable. |
| **App: Startup/init and configure endpoint**<br>`integration-tests/mock_server/app.py` | Consolidates tokenizer loading to include the fallback; sets `MOCK_SERVER_FALLBACK_TOKENIZER` in the env; the configure endpoint can set and load a provided `fallback_tokenizer`. |
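
For orientation, a minimal sketch of the new config field, assuming a pydantic-settings style config class (the field name and default come from this PR; the class shape, env-prefix wiring, and description are assumptions):

    from pydantic import Field
    from pydantic_settings import BaseSettings

    class MockServerConfig(BaseSettings):
        # Assumed env prefix; this is what surfaces MOCK_SERVER_FALLBACK_TOKENIZER.
        model_config = {"env_prefix": "MOCK_SERVER_"}

        # Field name and default are from this PR; the description is illustrative.
        fallback_tokenizer: str = Field(
            default="Qwen/Qwen3-0.6B",
            description="Tokenizer to use when the requested model has none loaded.",
        )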

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Client
  participant App as Mock Server App
  participant TokSvc as TokenizerService
  participant HF as Transformers

  rect rgb(245,245,255)
    note over App,TokSvc: Startup initialization
    App->>TokSvc: load_tokenizers(server_config.tokenizer_models + fallback)
    App->>TokSvc: set_fallback_tokenizer(server_config.fallback_tokenizer)
    TokSvc->>HF: Import AutoTokenizer (warnings suppressed)
    TokSvc-->>App: Tokenizers loaded
  end

  rect rgb(245,255,245)
    note over Client,TokSvc: Request-time tokenizer selection
    Client->>App: Request with model X
    App->>TokSvc: get_tokenizer(X)
    alt Tokenizer X loaded
      TokSvc-->>App: tokenizer(X)
    else Fallback available
      TokSvc-->>App: tokenizer(fallback)
    else Neither available
      TokSvc-->>App: ValueError (not found: X and fallback)
      App-->>Client: 404
    end
    App-->>Client: Proceed with tokenization using selected tokenizer
  end
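The request-time selection shown above reduces to a small lookup. A minimal sketch, assuming `self._tokenizers` is a dict of loaded tokenizers keyed by model name (the attribute and method names appear elsewhere in this PR; the exact body may differ):

    def get_tokenizer(self, model_name: str):
        # Fall back to the configured tokenizer when the requested one is missing;
        # error only when neither is loaded.
        if model_name not in self._tokenizers:
            if self._fallback_tokenizer not in self._tokenizers:
                raise ValueError(
                    f"No tokenizer loaded for {model_name} or {self._fallback_tokenizer}"
                )
            model_name = self._fallback_tokenizer
        return self._tokenizers[model_name]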
sequenceDiagram
  autonumber
  actor Client
  participant App as Mock Server App
  participant TokSvc as TokenizerService

  rect rgb(255,250,240)
    note over Client,App: Configure endpoint
    Client->>App: POST /configure { fallback_tokenizer?, tokenizer_models? }
    alt fallback_tokenizer provided
      App->>TokSvc: load_tokenizers([fallback_tokenizer])
      App->>TokSvc: set_fallback_tokenizer(fallback_tokenizer)
      TokSvc-->>App: ack
    else no fallback provided
      App-->>Client: ack (no fallback change)
    end
    App-->>Client: 200 Config applied
  end
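As a usage sketch, repointing the fallback at runtime might look like this (host and port are hypothetical; the endpoint and payload field come from the diagram above):

    import requests

    resp = requests.post(
        "http://localhost:8000/configure",  # hypothetical address
        json={"fallback_tokenizer": "Qwen/Qwen3-0.6B"},
    )
    resp.raise_for_status()  # expect 200 once the config is applied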

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

I twitched my ears at tokens’ fate,
When X went missing, don’t berate—
I hop to fallback, swift and clean,
A Qwen-ly net beneath the scene.
Now 404s are fewer, whee!
Carrots saved for you and me. 🥕✨

Pre-merge checks

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped: CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The title succinctly and accurately describes the primary enhancement, adding fallback tokenizer support to the mock server. It follows conventional “feat…” commit styling and is concise and informative for teammates scanning the change history. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 88.89%, which meets the required threshold of 80.00%. |

codecov bot commented Sep 28, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.


@coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
integration-tests/mock_server/app.py (1)

43-51: Avoid loading duplicate tokenizer entries

If the fallback tokenizer is already listed in server_config.tokenizer_models, this will load the same model twice. You can keep order while deduplicating by normalizing the list before passing it to the service.

-    tokenizer_models = [
-        *server_config.tokenizer_models,
-        server_config.fallback_tokenizer,
-    ]
+    tokenizer_models = list(
+        dict.fromkeys(
+            [*server_config.tokenizer_models, server_config.fallback_tokenizer]
+        )
+    )
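
For reference, `dict.fromkeys` keeps only the first occurrence of each key and preserves insertion order, so the dedup leaves the configured ordering intact:

    models = ["Qwen/Qwen3-0.6B", "gpt2", "Qwen/Qwen3-0.6B"]  # illustrative names
    deduped = list(dict.fromkeys(models))
    assert deduped == ["Qwen/Qwen3-0.6B", "gpt2"]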
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a58de00 and 3b49cb7.

📒 Files selected for processing (5)
  • integration-tests/README.md (9 hunks)
  • integration-tests/mock_server/app.py (3 hunks)
  • integration-tests/mock_server/config.py (1 hunks)
  • integration-tests/mock_server/models.py (1 hunks)
  • integration-tests/mock_server/tokenizer_service.py (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
integration-tests/mock_server/app.py (1)
integration-tests/mock_server/tokenizer_service.py (2)
  • set_fallback_tokenizer (72-74)
  • load_tokenizers (27-40)
🔇 Additional comments (3)
integration-tests/mock_server/config.py (1)

107-112: Fallback configuration wired cleanly

Field shape, description, and default tie in cleanly with the rest of the settings surface. Nice addition.

integration-tests/mock_server/models.py (1)

23-26: Model schema aligns with new config surface

Optional field makes sense here and keeps ConfigureMessage backwards compatible while exposing the new knob.

integration-tests/mock_server/app.py (1)

79-79: Good call adding the env propagation

Mirrors how the rest of the config is shared across workers; thanks for keeping that consistent.

Comment on lines +144 to +147
    if request.fallback_tokenizer is not None:
        tokenizer_service.load_tokenizers([request.fallback_tokenizer])
        tokenizer_service.set_fallback_tokenizer(request.fallback_tokenizer)
        logger.info(f"Fallback tokenizer set to {request.fallback_tokenizer}")

⚠️ Potential issue | 🟠 Major

Persist fallback choice back into server_config

We load and set the new fallback on the service, but server_config.fallback_tokenizer stays at its previous value. As a result, /configure responses and /health still report the old fallback, and any code that later reads server_config.fallback_tokenizer (including multi-worker env propagation or reload flows) will drift from the active tokenizer. Please assign it so the in-memory config stays truthful.

     if request.fallback_tokenizer is not None:
         tokenizer_service.load_tokenizers([request.fallback_tokenizer])
         tokenizer_service.set_fallback_tokenizer(request.fallback_tokenizer)
+        server_config.fallback_tokenizer = request.fallback_tokenizer
         logger.info(f"Fallback tokenizer set to {request.fallback_tokenizer}")
🤖 Prompt for AI Agents
In integration-tests/mock_server/app.py around lines 144 to 147, the code sets
the fallback on the tokenizer_service but doesn't update the in-memory
server_config; set server_config.fallback_tokenizer = request.fallback_tokenizer
after loading/setting the tokenizer so the server_config reflects the active
fallback (ensure server_config is in scope or import it if needed).
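
To make the drift concrete, a test along these lines would fail without the assignment (the fixture and response keys are assumptions; the comment above only states that /configure responses and /health report the fallback):

    def test_fallback_persisted(client):  # hypothetical FastAPI TestClient fixture
        client.post("/configure", json={"fallback_tokenizer": "gpt2"})
        health = client.get("/health").json()
        # Fails if server_config.fallback_tokenizer was never updated.
        assert health["fallback_tokenizer"] == "gpt2"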

Comment on lines 44 to 50
         if model_name not in self._tokenizers:
-            raise ValueError(f"No tokenizer loaded for {model_name}")
+            if self._fallback_tokenizer not in self._tokenizers:
+                raise ValueError(
+                    f"No tokenizer loaded for {model_name} or {self._fallback_tokenizer}"
+                )
+            model_name = self._fallback_tokenizer


⚠️ Potential issue | 🟠 Major

Restore lazy tokenizer loading before falling back.

We now short-circuit to the fallback whenever the requested model key is missing, which means we never even try to lazily load the requested tokenizer anymore. In the current server flows that rely on lazy loading (run without preloading, runtime configure updates, etc.), this silently swaps responses to the fallback tokenizer or just raises when no fallback is configured—a regression from today’s behavior. Please attempt to load the requested tokenizer first and only fall back when that load really fails, while also bootstrapping the fallback if it hasn’t been loaded yet.

-        if model_name not in self._tokenizers:
-            if self._fallback_tokenizer not in self._tokenizers:
-                raise ValueError(
-                    f"No tokenizer loaded for {model_name} or {self._fallback_tokenizer}"
-                )
-            model_name = self._fallback_tokenizer
+        if model_name not in self._tokenizers:
+            try:
+                logger.info(f"Lazy-loading tokenizer for model: {model_name}")
+                self._tokenizers[model_name] = AutoTokenizer.from_pretrained(
+                    model_name, trust_remote_code=True
+                )
+            except Exception as exc:
+                fallback = self._fallback_tokenizer
+                if not fallback:
+                    raise ValueError(
+                        f"No tokenizer loaded for {model_name}"
+                    ) from exc
+                if fallback not in self._tokenizers:
+                    logger.info(f"Lazy-loading fallback tokenizer: {fallback}")
+                    self._tokenizers[fallback] = AutoTokenizer.from_pretrained(
+                        fallback, trust_remote_code=True
+                    )
+                model_name = fallback
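If adopted, the lazy path could be exercised along these lines (the constructor shape and model names are assumptions; `set_fallback_tokenizer` and `get_tokenizer` are the service methods named in this PR):

    svc = TokenizerService()  # constructor shape assumed
    svc.set_fallback_tokenizer("Qwen/Qwen3-0.6B")
    # First attempts to lazy-load "some/unknown-model"; only if that load
    # fails does it fall back to (and lazy-load) Qwen/Qwen3-0.6B.
    tok = svc.get_tokenizer("some/unknown-model")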

@ajcasagrande marked this pull request as draft September 28, 2025 19:29
@the-david-oy (Contributor) left a comment

Great job catching this! This feature will be helpful.

    logger.info("Tokenizer models loaded successfully")

    if request.fallback_tokenizer is not None:
        tokenizer_service.load_tokenizers([request.fallback_tokenizer])
@the-david-oy (Contributor)

Is this always supposed to be called, even if the tokenizer specified exists?

@ajcasagrande (Contributor, Author)

Yeah, I need to add a check to see if it's already been loaded. CodeRabbit brought that up a few times.
