
Conversation


@drandahl drandahl commented Oct 31, 2025

Motivation and Context

  1. Why is this change required? Requested per Create Chat Client which used LiteLLM #194
  2. What problem does it solve? Enables easier configuration across various model providers.
  3. What scenario does it contribute to? Create Chat Client which used LiteLLM #194
  4. If it fixes an open issue, please link to the issue here. Create Chat Client which used LiteLLM #194

Description

This change introduces an agent-framework-litellm package that extends OpenAIBaseChatClient and OpenAIBaseResponsesClient into two new clients backed by LiteLLM. Because LiteLLM is largely compatible with OpenAI, we first translate the LiteLLM format into the OpenAI format and then leverage our existing OpenAI->AF translation layer to complete the translation to AF.
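
A minimal usage sketch of the new chat client is shown below; the import path and model string are illustrative assumptions, not taken verbatim from the package:

    # Minimal usage sketch (assumed module name and model string; the actual package
    # layout may differ). get_response mirrors the existing OpenAI-based clients.
    import asyncio

    from agent_framework_litellm import LiteLlmChatClient  # assumed import path

    async def main() -> None:
        # Any provider/model string that LiteLLM understands, e.g. "azure/gpt-4o".
        client = LiteLlmChatClient(model_id="azure/gpt-4o")
        response = await client.get_response("What is the capital of France?")
        print(response.text)

    asyncio.run(main())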

Note that we have noticed cases where LiteLLM may deviate from OpenAI, in particular to support special non-OpenAI features (see https://docs.litellm.ai/docs/reasoning_content). This PR does not exhaustively cover integration against all providers. The following has been manually tested:

Integration tests were written and executed against Azure OpenAI as the provider, with tests copied directly from the OpenAI client tests.

Two features are known not to be supported by this PR:

  • StructuredOutput for the Responses client (this requires a responses.parse API, which is not available through LiteLLM).
  • HostedWebSearchTool, HostedCodeInterpreterTool, and HostedFileSearchTool tool calls (these require investigation and were taken as out of scope for this work). Custom tool calls are confirmed to work.

Contribution Checklist

  • [ X ] The code builds clean without any errors or warnings
  • [ X ] The PR follows the Contribution Guidelines
  • [ X ] All unit tests pass, and I have added new tests where possible
  • [ X ] Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

Copilot AI review requested due to automatic review settings October 31, 2025 21:21
@markwallace-microsoft markwallace-microsoft added the "documentation" (Improvements or additions to documentation) and "python" labels on Oct 31, 2025
@markwallace-microsoft
Member

Python Test Coverage

Python Test Coverage Report

File      Stmts    Miss    Cover    Missing
TOTAL     11969    1841    84%
report-only-changed-files is enabled. No files were changed during this commit :)

Python Unit Test Overview

Tests    Skipped    Failures    Errors    Time
1531     123 💤     0 ❌        0 🔥      34.376s ⏱️

Contributor

Copilot AI left a comment


Pull Request Overview

This PR adds LiteLLM integration to the Microsoft Agent Framework, enabling the framework to work with LiteLLM's unified interface for multiple LLM providers. The integration includes both chat completion and responses API clients with full test coverage and sample code.

  • Adds a new agent-framework-litellm package with LiteLlmChatClient and LiteLlmResponsesClient
  • Includes comprehensive unit and integration tests covering various scenarios
  • Provides sample code demonstrating usage of both clients
  • Updates configuration files and documentation to support the new integration

Reviewed Changes

Copilot reviewed 15 out of 17 changed files in this pull request and generated 10 comments.

Summary per file:

  • python/uv.lock: Adds agent-framework-litellm package dependencies to the lockfile
  • python/pyproject.toml: Registers the new litellm package in workspace dependencies
  • python/packages/litellm/*: Core implementation files for the LiteLLM chat and responses clients
  • python/packages/litellm/tests/*: Comprehensive test suite with unit and integration tests
  • python/samples/getting_started/chat_client/*: Sample code demonstrating LiteLLM client usage
  • python/.env.example: Adds the LITE_LLM_MODEL_ID environment variable
  • python/packages/core/pyproject.toml: Adds litellm to the "all" extras

return cast(ModelResponse, lite_llm_response)

def lite_llm_to_openai_completion(self, response: ModelResponse) -> ChatCompletion:
# Convert a LiteLLM ResponsesAPIResponse to an OpenAI Response. LiteLLM aims to match the OpenAI A I,

Copilot AI Oct 31, 2025


Incorrect spacing in comment: 'OpenAI A I' should be 'OpenAI API'.

Suggested change
# Convert a LiteLLM ResponsesAPIResponse to an OpenAI Response. LiteLLM aims to match the OpenAI A I,
# Convert a LiteLLM ResponsesAPIResponse to an OpenAI Response. LiteLLM aims to match the OpenAI API,

# however in the future there may be differences that need to be accounted for here.
openai_event = cast(ChatCompletionChunk, event)

# LiteLLM does not providet this as a first-class field, so we map it here.

Copilot AI Oct 31, 2025


Corrected spelling of 'providet' to 'provide'.

Suggested change
# LiteLLM does not providet this as a first-class field, so we map it here.
# LiteLLM does not provide this as a first-class field, so we map it here.

def lite_llm_to_openai_response(self, response: ModelResponse) -> ChatCompletion:
"""Convert a LiteLLM ModelResponse to an OpenAI ChatCompletion."""
# OpenAI parsing code currently directly checks for OpenAI classes. However,
# LiteLLM implements the OpenAI API via its own classes, compatable via

Copilot AI Oct 31, 2025


Corrected spelling of 'compatable' to 'compatible'.

Suggested change
# LiteLLM implements the OpenAI API via its own classes, compatable via
# LiteLLM implements the OpenAI API via its own classes, compatible via


# Development Notes

* LiteLLM doesn't have a generic way of setting config for model id (see https://docs.litellm.ai/docs/set_keys for more information on available environment vairiables). For this reason, we've added the "LITE_LLM_MODEL" env param which may be set. The integration still supports LiteLLM's list of provider-specific environment variables.

Copilot AI Oct 31, 2025


Corrected spelling of 'vairiables' to 'variables'.

Suggested change
* LiteLLM doesn't have a generic way of setting config for model id (see https://docs.litellm.ai/docs/set_keys for more information on available environment vairiables). For this reason, we've added the "LITE_LLM_MODEL" env param which may be set. The integration still supports LiteLLM's list of provider-specific environment variables.
* LiteLLM doesn't have a generic way of setting config for model id (see https://docs.litellm.ai/docs/set_keys for more information on available environment variables). For this reason, we've added the "LITE_LLM_MODEL" env param which may be set. The integration still supports LiteLLM's list of provider-specific environment variables.


# Development Notes

* LiteLLM doesn't have a generic way of setting config for model id (see https://docs.litellm.ai/docs/set_keys for more information on available environment vairiables). For this reason, we've added the "LITE_LLM_MODEL" env param which may be set. The integration still supports LiteLLM's list of provider-specific environment variables.

Copilot AI Oct 31, 2025


The documentation mentions 'LITE_LLM_MODEL' env param, but the code and other documentation use 'LITE_LLM_MODEL_ID'. This inconsistency should be corrected to use 'LITE_LLM_MODEL_ID' throughout.

Suggested change
* LiteLLM doesn't have a generic way of setting config for model id (see https://docs.litellm.ai/docs/set_keys for more information on available environment vairiables). For this reason, we've added the "LITE_LLM_MODEL" env param which may be set. The integration still supports LiteLLM's list of provider-specific environment variables.
* LiteLLM doesn't have a generic way of setting config for model id (see https://docs.litellm.ai/docs/set_keys for more information on available environment vairiables). For this reason, we've added the "LITE_LLM_MODEL_ID" env param which may be set. The integration still supports LiteLLM's list of provider-specific environment variables.

Comment on lines +115 to +119
def lite_llm_to_openai_completion(self, response: ModelResponse) -> ChatCompletion:
# Convert a LiteLLM ResponsesAPIResponse to an OpenAI Response. LiteLLM aims to match the OpenAI A I,
# however in the future there may be differences that need to be accounted for here.
return cast(ChatCompletion, response)


Copilot AI Oct 31, 2025


The method lite_llm_to_openai_completion appears to be unused. Line 153 calls lite_llm_to_openai_response instead, which has the same implementation. Consider removing this duplicate method or clarifying if it's intentionally kept for future use.

Suggested change
def lite_llm_to_openai_completion(self, response: ModelResponse) -> ChatCompletion:
# Convert a LiteLLM ResponsesAPIResponse to an OpenAI Response. LiteLLM aims to match the OpenAI A I,
# however in the future there may be differences that need to be accounted for here.
return cast(ChatCompletion, response)

assert second_response.text is not None
# Should NOT use the weather tool since it was only run-level in previous call
# Call count should still be 1 (no additional calls)
assert call_count == 1

Copilot AI Oct 31, 2025


Test is always true, because of this condition.

assert second_response.text is not None
# Should NOT use the weather tool since it was only run-level in previous call
# Call count should still be 1 (no additional calls)
assert call_count == 1

Copilot AI Oct 31, 2025


Test is always true, because of this condition.

print(str(chunk), end="")
print("")
else:
response = await client.get_response(message, tools=get_weather)

Copilot AI Oct 31, 2025


This statement is unreachable.

Comment on lines +46 to +48
else:
response = await client.get_response(message, tools=[])
print(f"Assistant: {response}")

Copilot AI Oct 31, 2025


This statement is unreachable.

Suggested change
else:
response = await client.get_response(message, tools=[])
print(f"Assistant: {response}")

from pydantic import ValidationError


class LiteLlmCompletionAISettings(AFBaseSettings):
Member


Suggested change
class LiteLlmCompletionAISettings(AFBaseSettings):
class LiteLlmSettings(AFBaseSettings):


env_prefix: ClassVar[str] = "LITE_LLM_"

model_id: str | None = None
Member


why are api_key and api_base not part of these settings?


super().__init__(api_base=api_base, api_key=api_key, model_id=self.model_id, client=None, **kwargs) # type: ignore

def make_completion_streaming_request(self, **options_dict: Any) -> CustomStreamWrapper:
Member


these should probably be internal methods:

Suggested change
def make_completion_streaming_request(self, **options_dict: Any) -> CustomStreamWrapper:
def _make_completion_streaming_request(self, **options_dict: Any) -> CustomStreamWrapper:

# Completion conditionally returns this depending on streaming vs non-streaming
return cast(CustomStreamWrapper, lite_llm_response)

def make_completion_nonstreaming_request(self, **options_dict: Any) -> ModelResponse:
Member


Suggested change
def make_completion_nonstreaming_request(self, **options_dict: Any) -> ModelResponse:
def _make_completion_nonstreaming_request(self, **options_dict: Any) -> ModelResponse:

# Completion conditionally returns this depending on streaming vs non-streaming
return cast(ModelResponse, lite_llm_response)

def lite_llm_to_openai_completion(self, response: ModelResponse) -> ChatCompletion:
Member


Suggested change
def lite_llm_to_openai_completion(self, response: ModelResponse) -> ChatCompletion:
def _lite_llm_to_openai_completion(self, response: ModelResponse) -> ChatCompletion:

openai_event.usage = event.get("usage", None)
return openai_event

def lite_llm_to_openai_response(self, response: ModelResponse) -> ChatCompletion:
Member


Suggested change
def lite_llm_to_openai_response(self, response: ModelResponse) -> ChatCompletion:
def _lite_llm_to_openai_response(self, response: ModelResponse) -> ChatCompletion:


def make_completion_streaming_request(self, **options_dict: Any) -> CustomStreamWrapper:
options_dict["model"] = options_dict.pop("model_id")
lite_llm_response = completion(stream=True, **options_dict)
Member


litellm supports async, we should use that: https://github.com/BerriAI/litellm?tab=readme-ov-file#async-docs
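
For reference, a minimal sketch of LiteLLM's async API (per the link above); the model string and messages are placeholders:

    import asyncio

    from litellm import acompletion

    async def main() -> None:
        # Non-streaming: returns a ModelResponse mirroring OpenAI's ChatCompletion shape.
        response = await acompletion(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": "Hello"}],
        )
        print(response.choices[0].message.content)

        # Streaming: with stream=True the result can be iterated asynchronously.
        stream = await acompletion(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": "Hello"}],
            stream=True,
        )
        async for chunk in stream:
            print(chunk.choices[0].delta.content or "", end="")

    asyncio.run(main())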

**kwargs: Any,
) -> ChatResponse:
options_dict = self._prepare_options(messages, chat_options)
options_dict["model_id"] = self.model_id
Member


this should already be done in the prepare_options, and people should be able to set the model at runtime.

from pydantic import ValidationError


class LiteLlmResponsesAISettings(AFBaseSettings):
Member


please reuse a single LiteLlmSettings object, differentiate with CHAT_MODEL_ID and RESPONSES_MODEL_ID if needed (similar to openai settings)
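
A sketch of that suggestion (illustrative only, using pydantic-settings' BaseSettings as a stand-in for AFBaseSettings; the field and environment variable names are assumptions modeled on the OpenAI settings pattern):

    from pydantic import SecretStr
    from pydantic_settings import BaseSettings, SettingsConfigDict

    class LiteLlmSettings(BaseSettings):
        """Single settings object shared by both clients, keyed off the LITE_LLM_ prefix."""

        model_config = SettingsConfigDict(env_prefix="LITE_LLM_")

        chat_model_id: str | None = None       # LITE_LLM_CHAT_MODEL_ID
        responses_model_id: str | None = None  # LITE_LLM_RESPONSES_MODEL_ID
        api_key: SecretStr | None = None       # LITE_LLM_API_KEY (see the api_key/api_base question above)
        api_base: str | None = None            # LITE_LLM_API_BASE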


super().__init__(api_base=api_base, api_key=api_key, model_id=self.model_id, client=None, **kwargs) # type: ignore

def make_responses_streaming_request(self, **options_dict: Any) -> SyncResponsesAPIStreamingIterator:
Member


same comments as _chat_client, use async and make these methods internal
