
Releases: BerriAI/litellm

v1.58.2.dev1

16 Jan 20:53

What's Changed

New Contributors

Full Changelog: v1.58.2...v1.58.2.dev1

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.58.2.dev1
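
Once the container is up, the proxy exposes an OpenAI-compatible API on port 4000. A minimal sketch of calling it with the openai Python SDK is below; the API key and model name are placeholders that depend on the virtual keys and models configured on your proxy.

```python
# Minimal sketch: point an OpenAI-compatible client at the LiteLLM proxy started above.
# "sk-1234" and "gpt-4o" are placeholders - use the key and model your proxy is configured with.
from openai import OpenAI

client = OpenAI(
    api_key="sk-1234",                 # virtual key (or master key) issued by the proxy
    base_url="http://localhost:4000",  # the proxy started by the docker command above
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello from the LiteLLM proxy!"}],
)
print(response.choices[0].message.content)
```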

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 250.0 | 279.05652153373586 | 6.093771343336731 | 0.0 | 1823 | 0 | 214.89994900002785 | 2653.5651230000212 |
| Aggregated | Passed ✅ | 250.0 | 279.05652153373586 | 6.093771343336731 | 0.0 | 1823 | 0 | 214.89994900002785 | 2653.5651230000212 |

v1.58.2-dev2

16 Jan 03:32

What's Changed

New Contributors

Full Changelog: v1.58.2...v1.58.2-dev2

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.58.2-dev2

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 200.0 | 215.31102581586012 | 6.152564490107213 | 0.0 | 1841 | 0 | 176.80144700000255 | 3405.0107850000018 |
| Aggregated | Passed ✅ | 200.0 | 215.31102581586012 | 6.152564490107213 | 0.0 | 1841 | 0 | 176.80144700000255 | 3405.0107850000018 |

v1.58.2-dev1

15 Jan 16:39
d345a76

What's Changed

  • build(pyproject.toml): bump uvicorn dependency requirement + Azure o1 model check fix + Vertex Anthropic headers fix by @krrishdholakia in #7773
  • Add gemini/ frequency_penalty + presence_penalty support by @krrishdholakia in #7776 (see the sketch after this list)
  • Add back in non root image fixes by @rajatvig in #7781
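
Below is a minimal sketch of exercising the new penalty parameters through the SDK; the Gemini model name is a placeholder and the example assumes GEMINI_API_KEY is set in the environment.

```python
# Sketch of the new gemini/ penalty support (#7776).
# Assumes GEMINI_API_KEY is exported; "gemini/gemini-1.5-flash" is a placeholder model name.
import litellm

response = litellm.completion(
    model="gemini/gemini-1.5-flash",
    messages=[{"role": "user", "content": "Write a short poem about proxies."}],
    frequency_penalty=0.5,  # now forwarded to Gemini
    presence_penalty=0.3,   # now forwarded to Gemini
)
print(response.choices[0].message.content)
```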

Full Changelog: v1.58.2...v1.58.2-dev1

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.58.2-dev1

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Failed ❌ | 270.0 | 302.10681472219204 | 6.039948987754746 | 0.0 | 1807 | 0 | 228.94537999997056 | 4199.834433000035 |
| Aggregated | Failed ❌ | 270.0 | 302.10681472219204 | 6.039948987754746 | 0.0 | 1807 | 0 | 228.94537999997056 | 4199.834433000035 |

v1.58.2

15 Jan 06:09

What's Changed

Full Changelog: v1.58.1...v1.58.2

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.58.2

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 250.0 | 289.8090936126223 | 6.143711740946042 | 0.0 | 1838 | 0 | 228.12097899998207 | 2196.5017750000015 |
| Aggregated | Passed ✅ | 250.0 | 289.8090936126223 | 6.143711740946042 | 0.0 | 1838 | 0 | 228.12097899998207 | 2196.5017750000015 |

v1.58.1

14 Jan 05:55

🚨 Alpha - 1.58.0 has various perf improvements; we recommend waiting for a stable release before bumping it in production

What's Changed

  • (core sdk fix) - fix fallbacks stuck in infinite loop by @ishaan-jaff in #7751 (see the fallbacks sketch after this list)
  • [Bug fix]: v1.58.0 - issue with read request body by @ishaan-jaff in #7753
  • (litellm SDK perf improvements) - handle cases when unable to lookup model in model cost map by @ishaan-jaff in #7750
  • (prometheus - minor bug fix) - litellm_llm_api_time_to_first_token_metric not populating for bedrock models by @ishaan-jaff in #7740
  • (fix) health check - allow setting health_check_model by @ishaan-jaff in #7752
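
Below is a hedged sketch of the SDK-level fallbacks path that the infinite-loop fix in #7751 touches; the model names are placeholders and the exact shape of the fallbacks argument is an assumption based on litellm's reliability helpers.

```python
# Sketch of SDK fallbacks (the code path fixed in #7751).
# Model names are placeholders; the fallbacks kwarg shape is an assumption.
import litellm

response = litellm.completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "ping"}],
    fallbacks=["azure/gpt-4o-backup", "claude-3-5-sonnet-20240620"],  # tried in order if the primary model fails
)
print(response.choices[0].message.content)
```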

Full Changelog: v1.58.0...v1.58.1

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.58.1

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 250.0 | 294.2978673554448 | 6.045420383532543 | 0.0 | 1809 | 0 | 223.72276400000146 | 3539.4181890000027 |
| Aggregated | Passed ✅ | 250.0 | 294.2978673554448 | 6.045420383532543 | 0.0 | 1809 | 0 | 223.72276400000146 | 3539.4181890000027 |

v1.58.0

13 Jan 08:01

v1.58.0 - Alpha Release

🚨 This is an alpha release - we've made several performance / RPS improvements to litellm core. If you see any issues, please file them at https://github.com/BerriAI/litellm/issues

What's Changed

  • (proxy perf) - service logger don't always import OTEL in helper function by @ishaan-jaff in #7727
  • (proxy perf) - only read request body 1 time per request by @ishaan-jaff in #7728

Full Changelog: v1.57.11...v1.58.0

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.58.0

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 240.0 | 273.2166563012582 | 6.118315985413586 | 0.0033451700302972037 | 1829 | 1 | 75.1692759999969 | 3821.228761000043 |
| Aggregated | Passed ✅ | 240.0 | 273.2166563012582 | 6.118315985413586 | 0.0033451700302972037 | 1829 | 1 | 75.1692759999969 | 3821.228761000043 |

v1.57.11

13 Jan 06:52

v1.57.11 - Alpha Release

🚨 This is an alpha release - we've made several performance / RPS improvements to litellm core. If you see any issues, please file them at https://github.com/BerriAI/litellm/issues

What's Changed

  • (litellm SDK perf improvement) - use verbose_logger.debug and _cached_get_model_info_helper in _response_cost_calculator by @ishaan-jaff in #7720
  • (litellm sdk speedup) - use _model_contains_known_llm_provider in response_cost_calculator to check if the model contains a known litellm provider by @ishaan-jaff in #7721
  • (proxy perf) - only parse request body 1 time per request by @ishaan-jaff in #7722
  • Revert "(proxy perf) - only parse request body 1 time per request" by @ishaan-jaff in #7724
  • add azure o1 pricing by @krrishdholakia in #7715 (see the cost-lookup sketch after this list)
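
For reference, a small sketch of looking up pricing from the model cost map via the SDK; cost_per_token is an existing litellm helper, but the exact Azure o1 model key shown is an assumption - check litellm.model_cost for the real entry.

```python
# Sketch: look up per-token pricing from litellm's model cost map.
# "azure/o1-preview" is an assumed key - inspect litellm.model_cost for the exact name.
import litellm

prompt_cost, completion_cost = litellm.cost_per_token(
    model="azure/o1-preview",
    prompt_tokens=1_000,
    completion_tokens=500,
)
print(f"prompt: ${prompt_cost:.6f}, completion: ${completion_cost:.6f}")
```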

Full Changelog: v1.57.10...v1.57.11

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.57.11

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 240.0 | 270.55759577820237 | 6.130862160194138 | 0.0 | 1835 | 0 | 224.79750500002638 | 1207.8732939999952 |
| Aggregated | Passed ✅ | 240.0 | 270.55759577820237 | 6.130862160194138 | 0.0 | 1835 | 0 | 224.79750500002638 | 1207.8732939999952 |

v1.57.8-stable

13 Jan 07:32

Full Changelog: v1.57.8...v1.57.8-stable

🚨 Not stable - we received alerts about bugs on text-embedding-3 and are identifying the root cause.
✅ Resolved - this was not a litellm issue; it was caused by dd-trace-run patching the OpenAI SDK (DataDog/dd-trace-py#11994).

You are safe to upgrade to this version if you do not run dd-trace-run in front of litellm.

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.57.8-stable

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 240.0 | 271.08706884006597 | 6.1244865014274685 | 0.0 | 1832 | 0 | 221.9753340000068 | 2009.652516000017 |
| Aggregated | Passed ✅ | 240.0 | 271.08706884006597 | 6.1244865014274685 | 0.0 | 1832 | 0 | 221.9753340000068 | 2009.652516000017 |

v1.57.10

13 Jan 00:29
15b5203

v1.57.10 - Alpha Release

🚨 This is an alpha release - we've made several performance / RPS improvements to litellm core. If you see any issues, please file them at https://github.com/BerriAI/litellm/issues

Full Changelog: v1.57.8...v1.57.10

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.57.10

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 240.0 | 264.0629029362514 | 6.184926091214754 | 0.0 | 1851 | 0 | 213.62108399998192 | 1622.618584999998 |
| Aggregated | Passed ✅ | 240.0 | 264.0629029362514 | 6.184926091214754 | 0.0 | 1851 | 0 | 213.62108399998192 | 1622.618584999998 |

v1.57.8

11 Jan 06:31

What's Changed

  • (proxy latency/perf fix - user_api_key_auth) - use asyncio.create_task for caching the virtual key once it's validated by @ishaan-jaff in #7676
  • (litellm sdk - perf improvement) - optimize response_cost_calculator by @ishaan-jaff in #7674
  • (litellm sdk - perf improvement) - use O(1) set lookups for checking llm providers / models by @ishaan-jaff in #7672
  • (litellm sdk - perf improvement) - optimize pre_call_check by @ishaan-jaff in #7673
  • [integrations/lunary] allow to pass custom parent run id to LLM calls by @hughcrt in #7651
  • LiteLLM Minor Fixes & Improvements (01/10/2025) - p1 by @krrishdholakia in #7670
  • (performance improvement - litellm sdk + proxy) - ensure litellm does not create unnecessary threads when running async functions by @ishaan-jaff in #7680 (see the async sketch after this list)
  • (litellm proxy perf) - pass num_workers cli arg to uvicorn when num_workers is specified by @ishaan-jaff in #7681
  • fix proxy pre call hook - only use asyncio.create_task if user opts into alerting by @ishaan-jaff in #7683
  • [Bug fix]: Proxy Auth Layer - Allow Azure Realtime routes as llm_api_routes by @ishaan-jaff in #7684
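
Since several of these changes target the async code path, here is a minimal async usage sketch; the model name is a placeholder and the matching provider key is assumed to be set in the environment.

```python
# Minimal async sketch (the code path touched by the thread-creation fix in #7680).
# "gpt-4o-mini" is a placeholder model name.
import asyncio
import litellm

async def main():
    response = await litellm.acompletion(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "hello"}],
    )
    print(response.choices[0].message.content)

asyncio.run(main())
```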

Full Changelog: v1.57.7...v1.57.8

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.57.8

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 210.0 | 225.29799695056985 | 6.153370698253471 | 0.0 | 1841 | 0 | 177.73327700001573 | 2088.13791099999 |
| Aggregated | Passed ✅ | 210.0 | 225.29799695056985 | 6.153370698253471 | 0.0 | 1841 | 0 | 177.73327700001573 | 2088.13791099999 |