Releases: BerriAI/litellm
v1.58.2.dev1
What's Changed
- build(pyproject.toml): bump uvicorn dependency requirement + Azure o1 model check fix + Vertex Anthropic headers fix by @krrishdholakia in #7773
- Add `gemini/` frequency_penalty + presence_penalty support by @krrishdholakia in #7776 (see the sketch after this list)
- feat(helm): add securityContext and pull policy values to migration job by @Hexoplon in #7652
- fix confusing save button label by @yujonglee in #7778
- [integrations/lunary] Improve Lunary documentation by @hughcrt in #7770
- Fix wrong URL for internal user invitation by @yujonglee in #7762
- Update instructor tutorial by @Winston-503 in #7784
- (helm) - allow specifying envVars on values.yaml + add helm lint test by @ishaan-jaff in #7789
- Fix anthropic pass-through end user tracking + add gemini-2.0-flash-thinking-exp by @krrishdholakia in #7772
- Add back in non root image fixes (#7781) by @krrishdholakia in #7795
- test: initial test to enforce all functions in user_api_key_auth.py h… by @krrishdholakia in #7797
- test: initial commit enforcing testing on all anthropic pass through … by @krrishdholakia in #7794
- build: bump certifi version - see if that fixes asyncio ssl issue on … by @krrishdholakia in #7800
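For context on the `gemini/` penalty support above, here is a minimal sketch using the litellm Python SDK. The model name and API key handling are illustrative assumptions, not taken from this release:

```python
import os
import litellm

# Assumes a Google AI Studio key; the model name below is an example and may
# differ from the gemini/ models available to you.
os.environ["GEMINI_API_KEY"] = "your-api-key"  # placeholder

response = litellm.completion(
    model="gemini/gemini-1.5-flash",  # hypothetical example model
    messages=[{"role": "user", "content": "Write a haiku about rate limits."}],
    frequency_penalty=0.5,  # OpenAI-style param, now forwarded to gemini/ (#7776)
    presence_penalty=0.3,
)
print(response.choices[0].message.content)
```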
New Contributors
- @Winston-503 made their first contribution in #7784
Full Changelog: v1.58.2...v1.58.2.dev1
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.58.2.dev1
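Once the container is up, the proxy exposes an OpenAI-compatible API on port 4000. A minimal sketch of calling it with the openai Python client; the virtual key and model name below are assumptions that depend on your proxy configuration:

```python
from openai import OpenAI

# Point the standard OpenAI client at the local LiteLLM proxy started above.
client = OpenAI(
    base_url="http://localhost:4000",  # proxy from the docker run command
    api_key="sk-1234",                 # placeholder: a virtual/master key you configured
)

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder: any model name defined in your proxy config
    messages=[{"role": "user", "content": "ping"}],
)
print(response.choices[0].message.content)
```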
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 250.0 | 279.05652153373586 | 6.093771343336731 | 0.0 | 1823 | 0 | 214.89994900002785 | 2653.5651230000212 |
Aggregated | Passed ✅ | 250.0 | 279.05652153373586 | 6.093771343336731 | 0.0 | 1823 | 0 | 214.89994900002785 | 2653.5651230000212 |
v1.58.2-dev2
What's Changed
- build(pyproject.toml): bump uvicorn dependency requirement + Azure o1 model check fix + Vertex Anthropic headers fix by @krrishdholakia in #7773
- Add `gemini/` frequency_penalty + presence_penalty support by @krrishdholakia in #7776
- feat(helm): add securityContext and pull policy values to migration job by @Hexoplon in #7652
- fix confusing save button label by @yujonglee in #7778
- [integrations/lunary] Improve Lunary documentation by @hughcrt in #7770
- Fix wrong URL for internal user invitation by @yujonglee in #7762
- Update instructor tutorial by @Winston-503 in #7784
New Contributors
- @Winston-503 made their first contribution in #7784
Full Changelog: v1.58.2...v1.58.2-dev2
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.58.2-dev2
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 200.0 | 215.31102581586012 | 6.152564490107213 | 0.0 | 1841 | 0 | 176.80144700000255 | 3405.0107850000018 |
Aggregated | Passed ✅ | 200.0 | 215.31102581586012 | 6.152564490107213 | 0.0 | 1841 | 0 | 176.80144700000255 | 3405.0107850000018 |
v1.58.2-dev1
What's Changed
- build(pyproject.toml): bump uvicorn dependency requirement + Azure o1 model check fix + Vertex Anthropic headers fix by @krrishdholakia in #7773
- Add `gemini/` frequency_penalty + presence_penalty support by @krrishdholakia in #7776
- Add back in non root image fixes by @rajatvig in #7781
Full Changelog: v1.58.2...v1.58.2-dev1
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.58.2-dev1
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Failed ❌ | 270.0 | 302.10681472219204 | 6.039948987754746 | 0.0 | 1807 | 0 | 228.94537999997056 | 4199.834433000035 |
Aggregated | Failed ❌ | 270.0 | 302.10681472219204 | 6.039948987754746 | 0.0 | 1807 | 0 | 228.94537999997056 | 4199.834433000035 |
v1.58.2
What's Changed
- Fix RPM/TPM limit typo in admin UI by @yujonglee in #7769
- Add AIM Guardrails support by @krrishdholakia in #7771
- Support temporary budget increases on keys by @krrishdholakia in #7754
- Litellm dev 01 13 2025 p2 by @krrishdholakia in #7758
- docs - iam role based access for bedrock by @ishaan-jaff in #7774
- (Feat) prometheus - emit remaining team budget metric on proxy startup by @ishaan-jaff in #7777
- (fix) `BaseAWSLLM` - cache IAM role credentials when used by @ishaan-jaff in #7775
Full Changelog: v1.58.1...v1.58.2
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.58.2
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 250.0 | 289.8090936126223 | 6.143711740946042 | 0.0 | 1838 | 0 | 228.12097899998207 | 2196.5017750000015 |
Aggregated | Passed ✅ | 250.0 | 289.8090936126223 | 6.143711740946042 | 0.0 | 1838 | 0 | 228.12097899998207 | 2196.5017750000015 |
v1.58.1
🚨 Alpha - 1.58.0 includes various perf improvements; we recommend waiting for a stable release before bumping in production
What's Changed
- (core sdk fix) - fix fallbacks stuck in infinite loop by @ishaan-jaff in #7751 (see the fallback sketch after this list)
- [Bug fix]: v1.58.0 - issue with read request body by @ishaan-jaff in #7753
- (litellm SDK perf improvements) - handle cases when unable to lookup model in model cost map by @ishaan-jaff in #7750
- (prometheus - minor bug fix) - `litellm_llm_api_time_to_first_token_metric` not populating for bedrock models by @ishaan-jaff in #7740
- (fix) health check - allow setting `health_check_model` by @ishaan-jaff in #7752
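Related to the fallback fix in #7751, a minimal sketch of how SDK fallbacks are typically configured with the litellm Router; the model names and deployments here are placeholders, not from this release:

```python
from litellm import Router

# Two logical deployments; if the primary errors, the router retries the fallback.
router = Router(
    model_list=[
        {"model_name": "primary-gpt", "litellm_params": {"model": "gpt-4o"}},
        {"model_name": "backup-claude", "litellm_params": {"model": "claude-3-5-sonnet-20240620"}},
    ],
    fallbacks=[{"primary-gpt": ["backup-claude"]}],  # the path #7751 keeps from looping
)

response = router.completion(
    model="primary-gpt",
    messages=[{"role": "user", "content": "hello"}],
)
print(response.choices[0].message.content)
```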
Full Changelog: v1.58.0...v1.58.1
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.58.1
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 250.0 | 294.2978673554448 | 6.045420383532543 | 0.0 | 1809 | 0 | 223.72276400000146 | 3539.4181890000027 |
Aggregated | Passed ✅ | 250.0 | 294.2978673554448 | 6.045420383532543 | 0.0 | 1809 | 0 | 223.72276400000146 | 3539.4181890000027 |
v1.58.0
v1.58.0 - Alpha Release
🚨 This is an alpha release - we've made several performance / RPS improvements to litellm core. If you see any issues, please file them at https://github.com/BerriAI/litellm/issues
What's Changed
- (proxy perf) - service logger don't always import OTEL in helper function by @ishaan-jaff in #7727
- (proxy perf) - only read request body 1 time per request by @ishaan-jaff in #7728
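To illustrate the idea behind #7728 (a generic sketch, not litellm's actual implementation): parse the body once per request, cache it on the request state, and have later hooks reuse the cached dict instead of re-reading the stream.

```python
import json
from fastapi import FastAPI, Request

app = FastAPI()

async def get_request_body(request: Request) -> dict:
    """Read and parse the body once; subsequent calls return the cached dict."""
    if not hasattr(request.state, "parsed_body"):
        raw = await request.body()
        request.state.parsed_body = json.loads(raw) if raw else {}
    return request.state.parsed_body

@app.post("/chat/completions")
async def chat_completions(request: Request):
    body = await get_request_body(request)        # first (and only) parse
    body_again = await get_request_body(request)  # reuses the cached result
    return {"model": body.get("model"), "same_object": body is body_again}
```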
Full Changelog: v1.57.11...v1.58.0
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.58.0
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 240.0 | 273.2166563012582 | 6.118315985413586 | 0.0033451700302972037 | 1829 | 1 | 75.1692759999969 | 3821.228761000043 |
Aggregated | Passed ✅ | 240.0 | 273.2166563012582 | 6.118315985413586 | 0.0033451700302972037 | 1829 | 1 | 75.1692759999969 | 3821.228761000043 |
v1.57.11
v1.57.11 - Alpha Release
🚨 This is an alpha release - we've made several performance / RPS improvements to litellm core. If you see any issues, please file them at https://github.com/BerriAI/litellm/issues
What's Changed
- (litellm SDK perf improvement) - use `verbose_logger.debug` and `_cached_get_model_info_helper` in `_response_cost_calculator` by @ishaan-jaff in #7720
- (litellm sdk speedup) - use `_model_contains_known_llm_provider` in `response_cost_calculator` to check if the model contains a known litellm provider by @ishaan-jaff in #7721
- (proxy perf) - only parse request body 1 time per request by @ishaan-jaff in #7722
- Revert "(proxy perf) - only parse request body 1 time per request" by @ishaan-jaff in #7724
- add azure o1 pricing by @krrishdholakia in #7715
Full Changelog: v1.57.10...v1.57.11
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.57.11
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 240.0 | 270.55759577820237 | 6.130862160194138 | 0.0 | 1835 | 0 | 224.79750500002638 | 1207.8732939999952 |
Aggregated | Passed ✅ | 240.0 | 270.55759577820237 | 6.130862160194138 | 0.0 | 1835 | 0 | 224.79750500002638 | 1207.8732939999952 |
v1.57.8-stable
Full Changelog: v1.57.8...v1.57.8-stable
🚨 Not stable - we've received alerts about bugs on `text-embedding-3`. Identifying the root cause.
✅ Resolved - this was not a litellm issue; it was caused by `dd-trace-run` patching the OpenAI SDK (DataDog/dd-trace-py#11994).
You are safe to upgrade to this version if you do not use `dd-trace-run` in front of litellm.
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.57.8-stable
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 240.0 | 271.08706884006597 | 6.1244865014274685 | 0.0 | 1832 | 0 | 221.9753340000068 | 2009.652516000017 |
Aggregated | Passed ✅ | 240.0 | 271.08706884006597 | 6.1244865014274685 | 0.0 | 1832 | 0 | 221.9753340000068 | 2009.652516000017 |
v1.57.10
v1.57.10 - Alpha Release
🚨 This is an alpha release - we've made several performance / RPS improvements to litellm core. If you see any issues, please file them at https://github.com/BerriAI/litellm/issues
What's Changed
- Litellm dev 01 10 2025 p2 by @krrishdholakia in #7679
- Litellm dev 01 10 2025 p3 by @krrishdholakia in #7682
- build: new ui build by @krrishdholakia in #7685
- fix(model_hub.tsx): clarify cost in model hub is per 1m tokens by @krrishdholakia in #7687
- Litellm dev 01 11 2025 p3 by @krrishdholakia in #7702
- (perf litellm) - use `_get_model_info_helper` for cost tracking by @ishaan-jaff in #7703
- (perf sdk) - minor changes to cost calculator to run helpers only when necessary by @ishaan-jaff in #7704
- (perf) - proxy, use `orjson` for reading request body by @ishaan-jaff in #7706 (see the sketch after this list)
- (minor fix - `aiohttp_openai/`) - fix get_custom_llm_provider by @ishaan-jaff in #7705
- (sdk perf fix) - only print args passed to litellm when debugging mode is on by @ishaan-jaff in #7708
- (perf) - only use response_cost_calculator 1 time per request (don't re-use the same helper twice per call) by @ishaan-jaff in #7709
- [BETA] Add OpenAI `/images/variations` + Topaz API support by @krrishdholakia in #7700
- (litellm sdk speedup router) - adds a helper `_cached_get_model_group_info` to use when trying to get deployment tpm/rpm limits by @ishaan-jaff in #7719
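A small sketch of the `orjson` swap referenced in #7706; this is a generic illustration of the technique, not litellm's code. `orjson.loads` parses raw bytes directly and is generally faster than the stdlib parser on hot paths like reading every incoming request body:

```python
import json
import orjson

raw_body = b'{"model": "gpt-4o", "messages": [{"role": "user", "content": "hi"}]}'

parsed_std = json.loads(raw_body)     # stdlib parser
parsed_fast = orjson.loads(raw_body)  # orjson: parses bytes directly, typically faster

assert parsed_std == parsed_fast
```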
Full Changelog: v1.57.8...v1.57.10
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.57.10
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 240.0 | 264.0629029362514 | 6.184926091214754 | 0.0 | 1851 | 0 | 213.62108399998192 | 1622.618584999998 |
Aggregated | Passed ✅ | 240.0 | 264.0629029362514 | 6.184926091214754 | 0.0 | 1851 | 0 | 213.62108399998192 | 1622.618584999998 |
v1.57.8
What's Changed
- (proxy latency/perf fix - user_api_key_auth) - use asyncio.create task for caching virtual key once it's validated by @ishaan-jaff in #7676
- (litellm sdk - perf improvement) - optimize `response_cost_calculator` by @ishaan-jaff in #7674
- (litellm sdk - perf improvement) - use O(1) set lookups for checking llm providers / models by @ishaan-jaff in #7672 (see the sketch after this list)
- (litellm sdk - perf improvement) - optimize `pre_call_check` by @ishaan-jaff in #7673
- [integrations/lunary] allow to pass custom parent run id to LLM calls by @hughcrt in #7651
- LiteLLM Minor Fixes & Improvements (01/10/2025) - p1 by @krrishdholakia in #7670
- (performance improvement - litellm sdk + proxy) - ensure litellm does not create unnecessary threads when running async functions by @ishaan-jaff in #7680
- (litellm proxy perf) - pass num_workers cli arg to uvicorn when `num_workers` is specified by @ishaan-jaff in #7681
- fix proxy pre call hook - only use `asyncio.create_task` if user opts into alerting by @ishaan-jaff in #7683
- [Bug fix]: Proxy Auth Layer - Allow Azure Realtime routes as llm_api_routes by @ishaan-jaff in #7684
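A generic sketch of the O(1) set-lookup idea from #7672 (illustrative only, not litellm's internals): membership checks against a set are constant time, while scanning a list grows linearly with the number of providers.

```python
# Hypothetical provider registry, used only for illustration.
KNOWN_PROVIDERS_LIST = ["openai", "azure", "anthropic", "bedrock", "gemini", "vertex_ai"]
KNOWN_PROVIDERS_SET = set(KNOWN_PROVIDERS_LIST)

def is_known_provider_slow(provider: str) -> bool:
    return provider in KNOWN_PROVIDERS_LIST  # O(n): walks the list on every call

def is_known_provider_fast(provider: str) -> bool:
    return provider in KNOWN_PROVIDERS_SET   # O(1): hash lookup, cost stays flat

assert is_known_provider_slow("bedrock") and is_known_provider_fast("bedrock")
```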
Full Changelog: v1.57.7...v1.57.8
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.57.8
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 210.0 | 225.29799695056985 | 6.153370698253471 | 0.0 | 1841 | 0 | 177.73327700001573 | 2088.13791099999 |
Aggregated | Passed ✅ | 210.0 | 225.29799695056985 | 6.153370698253471 | 0.0 | 1841 | 0 | 177.73327700001573 | 2088.13791099999 |