Conversation


@mfittko mfittko commented Sep 25, 2025

Fixes upstream issue BerriAI#14891 ([Bug]: OpenAI TTS via proxy /v1/audio/speech buffers instead of streaming)

Summary

  • Problem: /v1/audio/speech (OpenAI TTS) via the proxy buffered or yielded zero bytes, preventing progressive playback.
  • Root cause: the OpenAI TTS adapter returned an object tied to an already-closed streaming context, and the proxy's streaming generator iterated the stream incorrectly.
  • Fix: defer opening the OpenAI streaming context until aiter_bytes() is actually consumed, and iterate the async iterator correctly (see the sketch after this list). The synchronous speech() path is unchanged. A small script is added to verify streaming behavior.
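Roughly, the deferred-streaming idea looks like the sketch below. This is illustrative only (class and argument names are hypothetical, not the exact PR code), and it assumes the openai-python async client where `with_streaming_response.create(...)` returns an async context manager whose response exposes an async `iter_bytes()`:

```python
from typing import AsyncIterator

from openai import AsyncOpenAI


class DeferredAudioStream:
    """Holds the request arguments and only opens the upstream
    streaming context once the caller starts iterating bytes."""

    def __init__(self, client: AsyncOpenAI, **create_kwargs) -> None:
        self._client = client
        self._create_kwargs = create_kwargs

    async def aiter_bytes(self) -> AsyncIterator[bytes]:
        # The context manager stays open for the whole iteration, so the
        # upstream connection is not closed before every chunk has been
        # relayed downstream (the original bug).
        async with self._client.audio.speech.with_streaming_response.create(
            **self._create_kwargs
        ) as response:
            # Async chunk iterator on the streamed response; the exact
            # method name can vary across openai-python versions.
            async for chunk in response.iter_bytes():
                yield chunk
```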

Changes

  • OpenAI provider (async TTS): return a deferred streaming object whose aiter_bytes() opens with_streaming_response.create(...) and yields chunks.
  • Proxy generator: ensure bytes are iterated correctly from the returned stream (a sketch follows this list).
  • Scripts: add scripts/verify_tts_streaming.py to validate headers, time-to-first-byte, and bytes received.
  • Tests: add tests/litellm/test_tts_deferred_streaming.py to minimally validate deferred streaming yields bytes (skipped if async plugin missing).
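On the proxy side, relaying the deferred stream can look like the following sketch. It assumes a FastAPI route and the hypothetical `DeferredAudioStream` object from the earlier sketch; it is not the verbatim proxy code:

```python
from fastapi.responses import StreamingResponse


async def speech_streaming_response(deferred_stream) -> StreamingResponse:
    async def byte_generator():
        # Iterate the async byte iterator returned by aiter_bytes();
        # iterating the wrapper object itself (or buffering it first)
        # is what broke progressive playback.
        async for chunk in deferred_stream.aiter_bytes():
            yield chunk

    return StreamingResponse(byte_generator(), media_type="audio/mpeg")
```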

Checklist

  • I have added at least one test under tests/litellm/
  • I have attached/provided local test results
  • My PR scope is isolated and solves one problem (TTS streaming)

Notes

  • For browser clients, consume response.body as a ReadableStream instead of awaiting arrayBuffer()/blob(), so playback can begin before the full response has arrived.

@mfittko mfittko self-assigned this Sep 25, 2025
@mfittko mfittko marked this pull request as ready for review September 25, 2025 12:30
@mfittko mfittko marked this pull request as draft September 25, 2025 12:30