fix: unify qwen tts cache dir for tokenizer loading on Windows #218

Open

seidenbergerscott wants to merge 1 commit into jamiepine:main from seidenbergerscott:fix/windows-hf-cache-split

Conversation


@seidenbergerscott seidenbergerscott commented Feb 28, 2026

Summary

  • route Qwen TTS from_pretrained through a single HF cache root
  • pass cache_dir=HF_HUB_CACHE to avoid a split cache between hub and transformers
  • switch the deprecated torch_dtype arg to dtype for compatibility

Why

On Windows local setups, model assets can be split across .hf-cache/hub and .hf-cache/transformers, causing speech_tokenizer/preprocessor_config.json load errors and 500 errors during generation.
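The failure mode above can be sketched with a small stdlib-only helper. The function name and the .hf-cache layout are illustrative assumptions, not code from the PR:

```python
import os

def qwen_tts_cache_dir(hf_home=None):
    """Illustrative helper (not from the PR): derive one shared cache
    root so hub downloads and transformers loads resolve the same
    directory instead of splitting across .hf-cache/hub and
    .hf-cache/transformers."""
    root = hf_home or os.path.join(os.getcwd(), ".hf-cache")
    # Passing this one path as cache_dir=... to every from_pretrained
    # call keeps speech_tokenizer/preprocessor_config.json where the
    # tokenizer loader expects to find it.
    return os.path.join(root, "hub")
```

In the real backend this role is played by HF_HUB_CACHE from huggingface_hub, which already resolves the hub cache location consistently.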

Summary by CodeRabbit

  • Bug Fixes
    • Resolved Text-to-Speech model loading inconsistencies by standardizing cache directory handling across CPU and GPU load paths.


coderabbitai bot commented Feb 28, 2026

No actionable comments were generated in the recent review. 🎉


📥 Commits

Reviewing files that changed from the base of the PR and between 38bf96f and e1c6bba.

📒 Files selected for processing (1)
  • backend/backends/pytorch_backend.py

📝 Walkthrough

The PyTorchTTSBackend model loading is updated to centralize Hugging Face Hub cache directory handling by importing HF_HUB_CACHE and applying it to model loading calls. Additionally, the torch_dtype parameter is replaced with dtype, specifying float32 for CPU and bfloat16 for GPU paths.

Changes

Cohort / File(s) Summary
TTS Model Loading Configuration — backend/backends/pytorch_backend.py
Added HF_HUB_CACHE import and a tts_cache_dir variable; applied the cache_dir parameter to Qwen3TTSModel.from_pretrained calls in both CPU and non-CPU branches; replaced torch_dtype with dtype (float32 for CPU, bfloat16 for non-CPU).
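The per-branch load configuration described above can be sketched as a pure-Python selector; select_load_kwargs and the string dtype names are hypothetical stand-ins for the actual torch dtypes used in pytorch_backend.py:

```python
def select_load_kwargs(device: str, cache_dir: str) -> dict:
    """Sketch of the PR's load configuration: one shared cache_dir for
    every from_pretrained call, plus a per-device dtype (float32 on the
    CPU branch, bfloat16 otherwise). Strings stand in for torch dtypes."""
    return {
        "cache_dir": cache_dir,
        # The PR also renames the deprecated torch_dtype kwarg to dtype.
        "dtype": "float32" if device == "cpu" else "bfloat16",
    }
```

Both branches then call from_pretrained with the same cache_dir, so the hub download and the tokenizer load can never diverge into separate cache trees.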

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 A cache most organized, now takes its place,
Through Hub's own paths, we set the pace,
With dtype specified, both paths aligned,
Float and bfloat, perfectly designed! 🎉

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Description Check — Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check — Passed: the title clearly summarizes the main change: unifying the Qwen TTS cache directory to fix a Windows-specific tokenizer loading issue.
  • Docstring Coverage — Passed: docstring coverage is 100.00%, above the required 80.00% threshold.






@seidenbergerscott (Author)

Related issues: #217, #216

This PR fixes a Windows cache split between the Hugging Face hub and transformers cache paths that can cause model load failures (missing speech_tokenizer/preprocessor_config.json) and 500 errors during generation after download.

