
Conversation

@constantinius (Contributor) commented Sep 25, 2025

Add a first implementation of the LiteLLM integration, supporting completion and embeddings.

Closes https://linear.app/getsentry/issue/PY-1828/add-agent-monitoring-support-for-litellm
Closes https://linear.app/getsentry/issue/TET-1218/litellm-testing


Note

Introduce LiteLLMIntegration that instruments LiteLLM chat/embeddings calls with spans, token usage, optional prompt logging, and exception capture.

  • Integrations:
    • Add sentry_sdk/integrations/litellm.py with LiteLLMIntegration registering LiteLLM input/success/failure callbacks.
    • Start spans for chat/embeddings, set gen_ai.* metadata (provider/system, operation, model, params like max_tokens, temperature, top_p, stream).
    • Record LiteLLM-specific fields: api_base, api_version, custom_llm_provider.
    • Optionally capture request/response messages when include_prompts and PII are enabled.
    • Track token usage from response usage and capture exceptions; always finish spans.

Written by Cursor Bugbot for commit 1ecd559.
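
A minimal sketch of enabling the integration described in the note above (the include_prompts flag comes from the note; treat the exact constructor signature as an assumption):

import sentry_sdk
from sentry_sdk.integrations.litellm import LiteLLMIntegration

sentry_sdk.init(
    dsn="...",
    # Request/response messages are only captured when PII is allowed
    # and include_prompts is enabled, per the note above.
    send_default_pii=True,
    integrations=[LiteLLMIntegration(include_prompts=True)],
)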

@constantinius constantinius requested a review from a team as a code owner September 25, 2025 13:28
linear bot commented Sep 25, 2025

PY-1828 Add agent monitoring support for litellm


Comment on lines +275 to +281
# Make sure the callback lists exist, then register Sentry's callbacks
# only once, preserving any callbacks the user already configured.
litellm.success_callback = litellm.success_callback or []
if _success_callback not in litellm.success_callback:
    litellm.success_callback.append(_success_callback)

litellm.failure_callback = litellm.failure_callback or []
if _failure_callback not in litellm.failure_callback:
    litellm.failure_callback.append(_failure_callback)
@constantinius (Contributor, Author):

It seems that both success_callback and failure_callback are run in a thread, which might finish after completion returns. Since the span is closed in either callback, the span may finish after the surrounding transaction does, resulting in the span being dropped entirely. This should definitely be pointed out somewhere.

@sentrivana (Contributor):

There is definitely the potential for a timing issue but I don't see a way around it at the moment since the LiteLLM integration might not be in control of the overarching transaction.

From your testing when developing this, was this a real issue when something like a web framework was managing the transaction?

@constantinius (Contributor, Author):

It was only an issue when writing code like this:

import sentry_sdk
from litellm import completion

with sentry_sdk.start_transaction(...):
    result = completion(...)

When using it in a framework (tried with FastAPI), I could not reproduce this error.
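
To illustrate the race: LiteLLM invokes the success/failure callbacks on a worker thread, so in hand-rolled transactions the span can outlive the transaction. A crude sketch of a workaround (purely illustrative, not part of the integration; model name is arbitrary):

import time

import sentry_sdk
from litellm import completion

with sentry_sdk.start_transaction(op="demo", name="litellm-call"):
    result = completion(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "hi"}],
    )
    # Give the callback thread a moment to finish the span before the
    # transaction ends; real code would need a proper synchronization point.
    time.sleep(0.1)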

Comment on lines +103 to +111
params = {
    "model": SPANDATA.GEN_AI_REQUEST_MODEL,
    "stream": SPANDATA.GEN_AI_RESPONSE_STREAMING,
    "max_tokens": SPANDATA.GEN_AI_REQUEST_MAX_TOKENS,
    "presence_penalty": SPANDATA.GEN_AI_REQUEST_PRESENCE_PENALTY,
    "frequency_penalty": SPANDATA.GEN_AI_REQUEST_FREQUENCY_PENALTY,
    "temperature": SPANDATA.GEN_AI_REQUEST_TEMPERATURE,
    "top_p": SPANDATA.GEN_AI_REQUEST_TOP_P,
}
@constantinius (Contributor, Author):

It is not clear where these parameters actually go in the arguments to completion.

@sentrivana (Contributor):

Don't understand this comment, can you elaborate? What do the params have to do with completion?

@constantinius (Contributor, Author):

completion takes quite generic kwargs that are then passed on to the model provider API. The ones above are used for OpenAI (at least I suspect that this is where and how we retrieve them).
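
For context, a small example of passing these kwargs to completion, which forwards them to the provider API (model name is arbitrary):

from litellm import completion

response = completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
    # The integration reads these kwargs off the call and records them as
    # the corresponding gen_ai.* span attributes.
    max_tokens=128,
    temperature=0.7,
    top_p=0.9,
)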


@sentrivana (Contributor) left a comment

Looks very clean!

Left a few comments, please have a look. 🙏🏻




@sentrivana (Contributor):

TY for incorporating my feedback, LGTM. From my POV there are 2 outstanding points:

LMK if you need support with any of these or if you want my feedback on anything else.

@constantinius (Contributor, Author) commented Oct 2, 2025

I've added tests and the minimum versions to the list. I hope that works out.

  • If we want the SDK to have a litellm extra as mentioned in the docs PR, it needs to be added to setup.py

Done
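
For reference, a sketch of the kind of extras entry this adds to setup.py (hypothetical excerpt; the exact version bound is an assumption, not copied from the PR):

extras_require={
    # Version bound is an assumption for illustration only.
    "litellm": ["litellm>=1.0"],
},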



linear bot commented Oct 2, 2025

TET-1218 LiteLLM testing

@sentrivana (Contributor) left a comment

Took the liberty of adding the missing test setup (the tests were not running in CI) and fixing mypy's complaints, so if CI is green we should be good to go.

@constantinius constantinius merged commit f979abf into master Oct 3, 2025
113 checks passed
@constantinius constantinius deleted the constantinius/feat/integration/litellm branch October 3, 2025 09:48
constantinius added a commit to getsentry/sentry-docs that referenced this pull request Oct 3, 2025

Add docs for the LiteLLM Python SDK integration.

Requires getsentry/sentry-python#4864

Closes https://linear.app/getsentry/issue/TET-1217/litellm-docs


Co-authored-by: Ivana Kellyer <ivana.kellyer@sentry.io>