@mfittko mfittko commented May 19, 2025

Overview

  • Proxy: Centralized error handling with enhanced OpenAI exception mapping and parsing for clearer logs and standardized responses.
  • Proxy: Rename function from image_generation to moderation in the moderations endpoint; change call type from audio_speech to pass_through_endpoint for accurate logging/metrics.
  • OpenAI (audio speech): Add streaming support via context managers and implement deferred streaming to avoid prematurely closing upstream streams; maintains sync compatibility while improving async performance.
  • CI: Add .github/workflows/sofatutor_image.yml and remove other workflows on this branch to keep only the Sofatutor image workflow.

Based on PR #4.
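
The centralized error handling described in the overview can be sketched roughly as follows. Note this is an illustrative sketch, not LiteLLM's actual API: `ProxyError` and `map_openai_error` are hypothetical names, and the substring matching stands in for the real exception parsing/mapping logic.

```python
# Hypothetical sketch of centralized proxy error handling: parse an
# upstream OpenAI-style error and map it to a standardized error with
# an HTTP status code, so logs and responses stay consistent.
# ProxyError and map_openai_error are illustrative names only.

class ProxyError(Exception):
    def __init__(self, status_code: int, message: str):
        self.status_code = status_code
        self.message = message
        super().__init__(message)

def map_openai_error(exc: Exception) -> ProxyError:
    """Translate an upstream exception into a standardized ProxyError."""
    text = str(exc)
    lowered = text.lower()
    if "rate limit" in lowered:
        return ProxyError(429, text)
    if "invalid api key" in lowered:
        return ProxyError(401, text)
    # Fall back to a generic server error for unrecognized failures.
    return ProxyError(500, text)
```

A single mapping function like this gives every endpoint the same status codes and log shape instead of ad-hoc per-route handling.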

Changes

  • litellm/proxy/proxy_server.py
    • Centralized error handling for proxy exceptions
    • Map/parse OpenAI errors to improve logging and response formatting
    • Rename moderations function: image_generation → moderation
    • Update call type: audio_speech → pass_through_endpoint
  • litellm/llms/openai/openai.py
    • Support streaming responses using context managers for efficient byte iteration
    • Implement deferred streaming to prevent premature upstream close; keeps sync behavior intact while enhancing async
  • .github/workflows/sofatutor_image.yml
    • New workflow dedicated to Sofatutor image build/publish

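The deferred-streaming change above can be illustrated with a minimal sketch: the upstream context manager stays open until the caller has finished iterating the bytes, so the stream is never closed before its chunks are consumed. Here `open_upstream` is a hypothetical stand-in for an httpx streaming-response context manager; none of these names are from the LiteLLM codebase.

```python
from contextlib import contextmanager
from typing import Iterator

@contextmanager
def open_upstream():
    # Stand-in for an httpx streaming response context manager.
    # In the real code this would open the upstream HTTP stream.
    yield iter([b"chunk-1", b"chunk-2", b"chunk-3"])

def deferred_stream() -> Iterator[bytes]:
    """Yield bytes lazily while keeping the upstream context open.

    Because the `with` block wraps the `yield from`, the upstream is
    only closed after the consumer exhausts (or abandons) the iterator,
    avoiding the premature-close bug described above.
    """
    with open_upstream() as byte_iter:
        yield from byte_iter
```

The same pattern applies to the async path with `async with` and `async for`, which is where the performance benefit shows up.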
Rationale

  • Improve reliability and clarity of proxy error handling and observability
  • Align function and call type naming with actual behavior for better analytics
  • Enable robust audio speech streaming paths with low memory usage and fewer edge-case failures
  • Keep CI minimal/specific for Sofatutor image builds on this branch

Changelog

  • Implement deferred streaming for OpenAI audio speech methods
  • Enhance OpenAI audio speech methods to support streaming via context managers
  • Change call type: audio_speech → pass_through_endpoint
  • Rename moderations function: image_generation → moderation
  • Centralize proxy error handling; improve OpenAI error parsing/mapping
  • CI: keep only sofatutor_image.yml; remove the other workflows on this branch

Files Changed

  • A .github/workflows/sofatutor_image.yml
  • M litellm/llms/openai/openai.py
  • M litellm/proxy/proxy_server.py

Notes

  • No API surface changes intended; naming updates affect logging/observability only
  • Audio streaming changes are backward compatible for sync callers

@mfittko mfittko self-assigned this May 19, 2025
…enAI exception mapping and parsing. Added functions to parse OpenAI error messages and handle proxy exceptions, improving error logging and response formatting.
…o "moderation" in the moderations endpoint, ensuring accurate logging and call type handling.
…through_endpoint" for improved logging and handling in the audio processing workflow.
…ing context managers, allowing for efficient byte iteration without buffering.
…g for efficient byte iteration without prematurely closing the upstream stream. This change enhances the async audio speech functionality while maintaining compatibility with existing synchronous behavior.
…leaner test setup in TTS deferred streaming tests
…ehavior and ensure proper streaming iteration
…ction for speech calls and add unit tests for verification
…ng alerts and implement unit test for missing webhook scenario
- Remove unused 'import openai' from cloud_watch.py
- Remove test_assistants_logging test
- Update docs to remove Assistants API mentions
@mfittko mfittko changed the base branch from feature/cloudwatch-assistants-logging to main November 25, 2025 16:32
@mfittko mfittko changed the base branch from main to v1.80.0-stable November 25, 2025 16:36
mfittko and others added 5 commits November 25, 2025 17:41
…jects

When using the Responses API with a prompt object, OpenAI returns
the instructions field as a list of message objects (expanded from
the prompt template) rather than a string.

The OpenAI SDK correctly defines this as:
  instructions: Union[str, List[ResponseInputItem], None]

But LiteLLM's ResponsesAPIResponse had:
  instructions: Optional[str]

This caused a Pydantic ValidationError when streaming responses
tried to parse ResponseCreatedEvent because it expected a string
but received a list.

This fix updates the type to accept both formats:
  instructions: Optional[Union[str, List[Dict[str, Any]]]]

Added tests for:
- Non-streaming responses with instructions as list
- Non-streaming responses with instructions as string
- Streaming events (ResponseCreatedEvent, ResponseInProgressEvent,
  ResponseCompletedEvent) with instructions as list
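The type fix above can be demonstrated with a minimal Pydantic sketch. The model name here is illustrative (the real model is `ResponsesAPIResponse`), but the field annotation mirrors the fix: `instructions` validates as either a string or a list of message-like dicts instead of raising a ValidationError on the list form.

```python
from typing import Any, Dict, List, Optional, Union
from pydantic import BaseModel

class ResponseSketch(BaseModel):
    # Mirrors the fix: accept both the plain-string form and the
    # expanded list-of-message-objects form returned for prompt objects.
    instructions: Optional[Union[str, List[Dict[str, Any]]]] = None

# Both shapes now validate without error.
as_string = ResponseSketch(instructions="Be concise.")
as_list = ResponseSketch(
    instructions=[{"role": "system", "content": "Be concise."}]
)
```

With the old `Optional[str]` annotation, the second construction would raise a Pydantic ValidationError, which is exactly what streaming event parsing hit.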