Releases: BerriAI/litellm
v1.50.1
What's Changed
- doc - using gpt-4o-audio-preview by @ishaan-jaff in #6326
- (refactor) `get_cache_key` to be under 100 LOC function by @ishaan-jaff in #6327
- Litellm openai audio streaming by @krrishdholakia in #6325
- LiteLLM Minor Fixes & Improvements (10/18/2024) by @krrishdholakia in #6320
- LiteLLM Minor Fixes & Improvements (10/19/2024) by @krrishdholakia in #6331
- fix - unhandled jsonDecodeError in `convert_to_model_response_object` by @ishaan-jaff in #6338
- (testing) add test coverage for init custom logger class by @ishaan-jaff in #6341
Full Changelog: v1.50.0...v1.50.1
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.50.1
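Once the container is up, you can sanity-check the proxy with a request to `/chat/completions` (the same route exercised in the load test below). This is a minimal sketch: the `gpt-4o` model alias and `sk-1234` key are placeholder assumptions for a model and virtual key you have already configured.

```bash
# Hedged smoke test against the proxy started above.
# "gpt-4o" and "sk-1234" are placeholders; replace them with a model alias
# and virtual key configured on your proxy.
curl http://localhost:4000/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello from the LiteLLM proxy"}]
  }'
```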
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 260.0 | 288.9506471715694 | 6.1364168904754175 | 0.0 | 1836 | 0 | 231.4412910000101 | 1825.7555540000112 |
Aggregated | Passed ✅ | 260.0 | 288.9506471715694 | 6.1364168904754175 | 0.0 | 1836 | 0 | 231.4412910000101 | 1825.7555540000112 |
v1.50.0-stable
What's Changed
- (feat) add `gpt-4o-audio-preview` models to model cost map by @ishaan-jaff in #6306
- (code quality) add ruff check PLR0915 for `too-many-statements` by @ishaan-jaff in #6309
- (doc) fix typo on Turn on / off caching per Key. by @ishaan-jaff in #6297
- (feat) Support `audio`, `modalities` params by @ishaan-jaff in #6304 (see the sketch after this list)
- (feat) Support audio param in responses streaming by @ishaan-jaff in #6312
- (feat) - allow using os.environ/ vars for any value on config.yaml by @ishaan-jaff in #6276
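As a sketch of the `audio`/`modalities` support added in #6304 and #6312, the request below mirrors OpenAI's `gpt-4o-audio-preview` parameters through the proxy. It assumes a `gpt-4o-audio-preview` deployment and a virtual key `sk-1234` are already configured; both names are placeholders.

```bash
# Hedged example: OpenAI-style audio request routed through the LiteLLM proxy.
# The model alias and key are assumptions; adjust to your own configuration.
curl http://localhost:4000/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "gpt-4o-audio-preview",
    "modalities": ["text", "audio"],
    "audio": {"voice": "alloy", "format": "wav"},
    "messages": [{"role": "user", "content": "Say hello as audio"}]
  }'
```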
Full Changelog: v1.49.7...v1.50.0-stable
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.50.0-stable
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 250.0 | 280.1783744989076 | 6.121418649325928 | 0.0 | 1832 | 0 | 224.80250699993576 | 1589.2013160000715 |
Aggregated | Passed ✅ | 250.0 | 280.1783744989076 | 6.121418649325928 | 0.0 | 1832 | 0 | 224.80250699993576 | 1589.2013160000715 |
v1.50.0
What's Changed
- (feat) add `gpt-4o-audio-preview` models to model cost map by @ishaan-jaff in #6306
- (code quality) add ruff check PLR0915 for `too-many-statements` by @ishaan-jaff in #6309
- (doc) fix typo on Turn on / off caching per Key. by @ishaan-jaff in #6297
- (feat) Support `audio`, `modalities` params by @ishaan-jaff in #6304
- (feat) Support audio param in responses streaming by @ishaan-jaff in #6312
- (feat) - allow using os.environ/ vars for any value on config.yaml by @ishaan-jaff in #6276
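A minimal sketch of the `os.environ/` support from #6276: values in `config.yaml` can reference environment variables rather than hard-coded secrets. The model names, variable names, and mount path below are illustrative assumptions, not the only supported layout.

```bash
# Illustrative config: os.environ/<VAR> values are resolved from the
# container's environment at startup. Names here are placeholders.
cat > config.yaml <<'EOF'
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
general_settings:
  master_key: os.environ/LITELLM_MASTER_KEY
EOF

docker run \
  -v $(pwd)/config.yaml:/app/config.yaml \
  -e OPENAI_API_KEY=$OPENAI_API_KEY \
  -e LITELLM_MASTER_KEY=sk-1234 \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.50.0 \
  --config /app/config.yaml
```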
Full Changelog: v1.49.7...v1.50.0
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.50.0
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 240.0 | 266.05337712404867 | 6.142852534799847 | 0.0 | 1838 | 0 | 211.22095199996238 | 1541.6589870000053 |
Aggregated | Passed ✅ | 240.0 | 266.05337712404867 | 6.142852534799847 | 0.0 | 1838 | 0 | 211.22095199996238 | 1541.6589870000053 |
v1.49.7-stable
What's Changed
- Revert "(perf) move s3 logging to Batch logging + async [94% faster p… by @ishaan-jaff in #6275
- (testing) add test coverage for LLM OTEL logging by @ishaan-jaff in #6227
- (testing) add unit tests for LLMCachingHandler Class by @ishaan-jaff in #6279
- LiteLLM Minor Fixes & Improvements (10/17/2024) by @krrishdholakia in #6293
Full Changelog: v1.49.6...v1.49.7-stable
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.49.7-stable
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 250.0 | 275.82870433443276 | 6.089150330248114 | 0.0 | 1821 | 0 | 224.8554669999976 | 1500.5543909999801 |
Aggregated | Passed ✅ | 250.0 | 275.82870433443276 | 6.089150330248114 | 0.0 | 1821 | 0 | 224.8554669999976 | 1500.5543909999801 |
v1.49.7
What's Changed
- Revert "(perf) move s3 logging to Batch logging + async [94% faster p… by @ishaan-jaff in #6275
- (testing) add test coverage for LLM OTEL logging by @ishaan-jaff in #6227
- (testing) add unit tests for LLMCachingHandler Class by @ishaan-jaff in #6279
- LiteLLM Minor Fixes & Improvements (10/17/2024) by @krrishdholakia in #6293
Full Changelog: v1.49.6...v1.49.7
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.49.7
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 200.0 | 217.72068803611796 | 6.198611902745536 | 0.0 | 1855 | 0 | 176.8321219999507 | 1433.260539999992 |
Aggregated | Passed ✅ | 200.0 | 217.72068803611796 | 6.198611902745536 | 0.0 | 1855 | 0 | 176.8321219999507 | 1433.260539999992 |
v1.49.6-stable
What's Changed
- (router testing) Add testing coverage for `run_async_fallback` and `run_sync_fallback` by @ishaan-jaff in #6256
- LiteLLM Minor Fixes & Improvements (10/15/2024) by @krrishdholakia in #6242
- (testing) Router add testing coverage by @ishaan-jaff in #6253
- (testing) add router unit testing for `send_llm_exception_alert`, `router_cooldown_event_callback`, cooldown utils by @ishaan-jaff in #6258
- Litellm router code coverage 3 by @krrishdholakia in #6274
- Remove "ask mode" from Canary search by @yujonglee in #6271
- LiteLLM Minor Fixes & Improvements (10/16/2024) by @krrishdholakia in #6265
Full Changelog: v1.49.5...v1.49.6-stable
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.49.6-stable
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Failed ❌ | 280.0 | 308.62173755687854 | 6.168234186408995 | 0.0 | 1846 | 0 | 209.20113499994386 | 2605.53480599998 |
Aggregated | Failed ❌ | 280.0 | 308.62173755687854 | 6.168234186408995 | 0.0 | 1846 | 0 | 209.20113499994386 | 2605.53480599998 |
v1.49.6
What's Changed
- (router testing) Add testing coverage for `run_async_fallback` and `run_sync_fallback` by @ishaan-jaff in #6256
- LiteLLM Minor Fixes & Improvements (10/15/2024) by @krrishdholakia in #6242
- (testing) Router add testing coverage by @ishaan-jaff in #6253
- (testing) add router unit testing for `send_llm_exception_alert`, `router_cooldown_event_callback`, cooldown utils by @ishaan-jaff in #6258
- Litellm router code coverage 3 by @krrishdholakia in #6274
- Remove "ask mode" from Canary search by @yujonglee in #6271
- LiteLLM Minor Fixes & Improvements (10/16/2024) by @krrishdholakia in #6265
Full Changelog: v1.49.5...v1.49.6
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.49.6
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 250.0 | 286.97499258450677 | 6.169343856034057 | 0.0 | 1846 | 0 | 229.1845549999607 | 2911.2547339999537 |
Aggregated | Passed ✅ | 250.0 | 286.97499258450677 | 6.169343856034057 | 0.0 | 1846 | 0 | 229.1845549999607 | 2911.2547339999537 |
v1.49.5
What's Changed
- (fix) prompt caching cost calculation OpenAI, Azure OpenAI by @ishaan-jaff in #6231
- (fix) arize handle optional params by @ishaan-jaff in #6243
- Bump hono from 4.5.8 to 4.6.5 in /litellm-js/spend-logs by @dependabot in #6245
- (refactor) caching - use _sync_set_cache by @ishaan-jaff in #6224
- Make meta in rerank API Response optional - Compatible with Opensource APIs by @ishaan-jaff in #6248
- (testing - litellm.Router ) add unit test coverage for pattern matching / wildcard routing by @ishaan-jaff in #6250
- (refactor) sync caching - use `LLMCachingHandler` class for get_cache by @ishaan-jaff in #6249
- (refactor) - caching use separate files for each cache class by @ishaan-jaff in #6251
Full Changelog: v1.49.4...v1.49.5
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.49.5
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 250.0 | 286.11866113691053 | 6.100923329145553 | 0.0 | 1826 | 0 | 224.7632039999985 | 2036.4872069999365 |
Aggregated | Passed ✅ | 250.0 | 286.11866113691053 | 6.100923329145553 | 0.0 | 1826 | 0 | 224.7632039999985 | 2036.4872069999365 |
v1.49.4
What's Changed
- (refactor router.py ) - PR 3 - Ensure all functions under 100 lines by @ishaan-jaff in #6181
- [Bug Fix]: fix litellm.caching imports on python SDK by @ishaan-jaff in #6219
- LiteLLM Minor Fixes & Improvements (10/14/2024) by @krrishdholakia in #6221
- test(router_code_coverage.py): check if all router functions are dire… by @krrishdholakia in #6186
- (refactor) use helper function `_assemble_complete_response_from_streaming_chunks` to assemble complete responses in caching and logging callbacks by @ishaan-jaff in #6220
- (refactor) OTEL - use safe_set_attribute for setting attributes by @ishaan-jaff in #6226
Full Changelog: v1.49.3...v1.49.4
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.49.4
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 200.0 | 212.5333893868387 | 6.244178319118513 | 0.0 | 1869 | 0 | 178.2565319999776 | 1357.8999799999565 |
Aggregated | Passed ✅ | 200.0 | 212.5333893868387 | 6.244178319118513 | 0.0 | 1869 | 0 | 178.2565319999776 | 1357.8999799999565 |
v1.49.3
What's Changed
- Litellm Minor Fixes & Improvements (10/12/2024) by @krrishdholakia in #6179
- build(config.yml): add codecov to repo by @krrishdholakia in #6172
- ci(config.yml): add local_testing tests to codecov coverage check by @krrishdholakia in #6183
- ci(config.yml): add further testing coverage to codecov by @krrishdholakia in #6184
- docs(configs.md): document all environment variables by @krrishdholakia in #6185
- (feat) add components to codecov yml by @ishaan-jaff in #6207
- (refactor) caching use LLMCachingHandler for async_get_cache and set_cache by @ishaan-jaff in #6208
- (feat) prometheus have well defined latency buckets by @ishaan-jaff in #6211
- (refactor caching) use LLMCachingHandler for caching streaming responses by @ishaan-jaff in #6210
- bump @getcanary/web@1.0.9 by @yujonglee in #6187
- (refactor caching) use common `_retrieve_from_cache` helper by @ishaan-jaff in #6212
Full Changelog: v1.49.2...v1.49.3
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.49.3
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Failed ❌ | 270.0 | 313.92829846432517 | 5.997591789487422 | 0.0 | 1794 | 0 | 231.2673659999973 | 3079.4730589999517 |
Aggregated | Failed ❌ | 270.0 | 313.92829846432517 | 5.997591789487422 | 0.0 | 1794 | 0 | 231.2673659999973 | 3079.4730589999517 |