Releases: BerriAI/litellm

v1.58.2-dev1

15 Jan 16:39
d345a76

What's Changed

  • build(pyproject.toml): bump uvicorn dependency requirement + Azure o1 model check fix + Vertex Anthropic headers fix by @krrishdholakia in #7773
  • Add gemini/ frequency_penalty + presence_penalty support by @krrishdholakia in #7776 (see the sketch after this list)
  • Add back in non root image fixes by @rajatvig in #7781
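
The gemini/ penalty support above maps the OpenAI-style frequency_penalty and presence_penalty parameters onto Gemini requests. A minimal sketch of exercising it through the litellm SDK - the model name, key handling, and penalty values are illustrative placeholders, not taken from this release's tests:

import os
import litellm

os.environ["GEMINI_API_KEY"] = "your-google-ai-studio-key"  # placeholder key

response = litellm.completion(
    model="gemini/gemini-1.5-flash",  # placeholder: any gemini/ chat model
    messages=[{"role": "user", "content": "Write a two-line poem about proxies."}],
    frequency_penalty=0.5,  # discourage repeated tokens
    presence_penalty=0.3,   # encourage new topics
)
print(response.choices[0].message.content)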

Full Changelog: v1.58.2...v1.58.2-dev1

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.58.2-dev1
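
Once the container is up, the proxy serves the OpenAI-compatible /chat/completions route used in the load test below. A hedged example of calling it with the openai Python client - the model alias "gpt-4o" and the key "sk-1234" are placeholders for whatever you have actually configured on the proxy:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")  # placeholder key

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder: a model alias configured on the proxy
    messages=[{"role": "user", "content": "Hello from the LiteLLM proxy!"}],
)
print(response.choices[0].message.content)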

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Failed ❌ | 270.0 | 302.10681472219204 | 6.039948987754746 | 0.0 | 1807 | 0 | 228.94537999997056 | 4199.834433000035 |
| Aggregated | Failed ❌ | 270.0 | 302.10681472219204 | 6.039948987754746 | 0.0 | 1807 | 0 | 228.94537999997056 | 4199.834433000035 |

v1.58.2

15 Jan 06:09

What's Changed

Full Changelog: v1.58.1...v1.58.2

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.58.2

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 250.0 | 289.8090936126223 | 6.143711740946042 | 0.0 | 1838 | 0 | 228.12097899998207 | 2196.5017750000015 |
| Aggregated | Passed ✅ | 250.0 | 289.8090936126223 | 6.143711740946042 | 0.0 | 1838 | 0 | 228.12097899998207 | 2196.5017750000015 |

v1.58.1

14 Jan 05:55

🚨 Alpha - 1.58.0 includes various perf improvements; we recommend waiting for a stable release before bumping to it in production

What's Changed

  • (core sdk fix) - fix fallbacks stuck in infinite loop by @ishaan-jaff in #7751
  • [Bug fix]: v1.58.0 - issue with read request body by @ishaan-jaff in #7753
  • (litellm SDK perf improvements) - handle cases when unable to lookup model in model cost map by @ishaan-jaff in #7750
  • (prometheus - minor bug fix) - litellm_llm_api_time_to_first_token_metric not populating for bedrock models by @ishaan-jaff in #7740
  • (fix) health check - allow setting health_check_model by @ishaan-jaff in #7752

Full Changelog: v1.58.0...v1.58.1

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.58.1

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 250.0 | 294.2978673554448 | 6.045420383532543 | 0.0 | 1809 | 0 | 223.72276400000146 | 3539.4181890000027 |
| Aggregated | Passed ✅ | 250.0 | 294.2978673554448 | 6.045420383532543 | 0.0 | 1809 | 0 | 223.72276400000146 | 3539.4181890000027 |

v1.58.0

13 Jan 08:01

v1.58.0 - Alpha Release

🚨 This is an alpha release - we've made several performance / RPS improvements to litellm core. If you see any issues, please file them at https://github.com/BerriAI/litellm/issues

What's Changed

  • (proxy perf) - service logger don't always import OTEL in helper function by @ishaan-jaff in #7727
  • (proxy perf) - only read request body 1 time per request by @ishaan-jaff in #7728 (see the sketch below)
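
For context on the request-body change, the general shape of the optimization is to parse the JSON body once and cache it on the request object so later hooks reuse the parsed dict. A rough illustration with a FastAPI/Starlette Request - this is not LiteLLM's actual helper, just the pattern; the function name and state attribute are made up:

from fastapi import Request

async def get_parsed_body(request: Request) -> dict:
    cached = getattr(request.state, "parsed_body", None)
    if cached is not None:
        return cached                 # reuse the already-parsed body
    body = await request.json()       # read and parse exactly once
    request.state.parsed_body = body
    return body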

Full Changelog: v1.57.11...v1.58.0

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.58.0

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 240.0 | 273.2166563012582 | 6.118315985413586 | 0.0033451700302972037 | 1829 | 1 | 75.1692759999969 | 3821.228761000043 |
| Aggregated | Passed ✅ | 240.0 | 273.2166563012582 | 6.118315985413586 | 0.0033451700302972037 | 1829 | 1 | 75.1692759999969 | 3821.228761000043 |

v1.57.11

13 Jan 06:52

v1.57.11 - Alpha Release

🚨 This is an alpha release - we've made several performance / RPS improvements to litellm core. If you see any issues, please file them at https://github.com/BerriAI/litellm/issues

What's Changed

  • (litellm SDK perf improvement) - use verbose_logger.debug and _cached_get_model_info_helper in _response_cost_calculator by @ishaan-jaff in #7720
  • (litellm sdk speedup) - use _model_contains_known_llm_provider in response_cost_calculator to check if the model contains a known litellm provider by @ishaan-jaff in #7721
  • (proxy perf) - only parse request body 1 time per request by @ishaan-jaff in #7722
  • Revert "(proxy perf) - only parse request body 1 time per request" by @ishaan-jaff in #7724
  • add azure o1 pricing by @krrishdholakia in #7715

Full Changelog: v1.57.10...v1.57.11

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.57.11

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 240.0 | 270.55759577820237 | 6.130862160194138 | 0.0 | 1835 | 0 | 224.79750500002638 | 1207.8732939999952 |
| Aggregated | Passed ✅ | 240.0 | 270.55759577820237 | 6.130862160194138 | 0.0 | 1835 | 0 | 224.79750500002638 | 1207.8732939999952 |

v1.57.8-stable

13 Jan 07:32

Full Changelog: v1.57.8...v1.57.8-stable

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.57.8-stable

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 240.0 | 271.08706884006597 | 6.1244865014274685 | 0.0 | 1832 | 0 | 221.9753340000068 | 2009.652516000017 |
| Aggregated | Passed ✅ | 240.0 | 271.08706884006597 | 6.1244865014274685 | 0.0 | 1832 | 0 | 221.9753340000068 | 2009.652516000017 |

v1.57.10

13 Jan 00:29
15b5203

v1.57.10 - Alpha Release

🚨 This is an alpha release - we've made several performance / RPS improvements to litellm core. If you see any issues, please file them at https://github.com/BerriAI/litellm/issues

Full Changelog: v1.57.8...v1.57.10

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.57.10

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 240.0 | 264.0629029362514 | 6.184926091214754 | 0.0 | 1851 | 0 | 213.62108399998192 | 1622.618584999998 |
| Aggregated | Passed ✅ | 240.0 | 264.0629029362514 | 6.184926091214754 | 0.0 | 1851 | 0 | 213.62108399998192 | 1622.618584999998 |

v1.57.8

11 Jan 06:31

What's Changed

  • (proxy latency/perf fix - user_api_key_auth) - use asyncio.create_task for caching virtual key once it's validated by @ishaan-jaff in #7676 (see the sketch after this list)
  • (litellm sdk - perf improvement) - optimize response_cost_calculator by @ishaan-jaff in #7674
  • (litellm sdk - perf improvement) - use O(1) set lookups for checking llm providers / models by @ishaan-jaff in #7672
  • (litellm sdk - perf improvement) - optimize pre_call_check by @ishaan-jaff in #7673
  • [integrations/lunary] allow to pass custom parent run id to LLM calls by @hughcrt in #7651
  • LiteLLM Minor Fixes & Improvements (01/10/2025) - p1 by @krrishdholakia in #7670
  • (performance improvement - litellm sdk + proxy) - ensure litellm does not create unnecessary threads when running async functions by @ishaan-jaff in #7680
  • (litellm proxy perf) - pass num_workers cli arg to uvicorn when num_workers is specified by @ishaan-jaff in #7681
  • fix proxy pre call hook - only use asyncio.create_task if user opts into alerting by @ishaan-jaff in #7683
  • [Bug fix]: Proxy Auth Layer - Allow Azure Realtime routes as llm_api_routes by @ishaan-jaff in #7684
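
The user_api_key_auth change in #7676 is essentially a fire-and-forget pattern: the key is validated and returned immediately while the cache write runs as a background task. A generic sketch of that pattern - the cache and function names below are placeholders, not LiteLLM internals:

import asyncio

_background_tasks: set = set()

async def _write_to_cache(cache: dict, token: str, key_info: dict) -> None:
    cache[token] = key_info  # stand-in for an async cache write

async def validate_key(cache: dict, token: str) -> dict:
    key_info = {"token": token, "valid": True}  # pretend validation result
    task = asyncio.create_task(_write_to_cache(cache, token, key_info))
    _background_tasks.add(task)                 # hold a reference so the task isn't GC'd
    task.add_done_callback(_background_tasks.discard)
    return key_info                             # return without awaiting the cache write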

Full Changelog: v1.57.7...v1.57.8

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.57.8

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 210.0 | 225.29799695056985 | 6.153370698253471 | 0.0 | 1841 | 0 | 177.73327700001573 | 2088.13791099999 |
| Aggregated | Passed ✅ | 210.0 | 225.29799695056985 | 6.153370698253471 | 0.0 | 1841 | 0 | 177.73327700001573 | 2088.13791099999 |

v1.57.7

10 Jan 23:40

What's Changed

  • (minor latency fixes / proxy) - use verbose_proxy_logger.debug() instead of litellm.print_verbose by @ishaan-jaff in #7664
  • feat(ui_sso.py): Allows users to use test key pane, and have team budget limits be enforced for their use-case by @krrishdholakia in #7666
  • fix(main.py): fix lm_studio/ embedding routing by @krrishdholakia in #7658
  • fix(vertex_ai/gemini/transformation.py): handle 'http://' in gemini p… by @krrishdholakia in #7660
  • Use environment variable for Athina logging URL by @vivek-athina in #7628

Full Changelog: v1.57.5...v1.57.7

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.57.7

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 200.0 | 218.4749677188173 | 6.216185012755876 | 0.0 | 1860 | 0 | 177.92223199990076 | 3911.6109139999935 |
| Aggregated | Passed ✅ | 200.0 | 218.4749677188173 | 6.216185012755876 | 0.0 | 1860 | 0 | 177.92223199990076 | 3911.6109139999935 |

v1.57.5

10 Jan 05:47

🚨🚨 Known issue - do not upgrade - Windows compatibility issue on this release

Relevant issue: #7677

What's Changed

Full Changelog: v1.57.4...v1.57.5

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.57.5

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 230.0 | 282.70225500655766 | 6.115771768544881 | 0.0 | 1830 | 0 | 206.44150200001832 | 3375.4479410000044 |
| Aggregated | Passed ✅ | 230.0 | 282.70225500655766 | 6.115771768544881 | 0.0 | 1830 | 0 | 206.44150200001832 | 3375.4479410000044 |