Summary
This issue tracks comprehensive improvements to the Oracle Cloud Infrastructure (OCI) Generative AI provider, addressing critical bugs and adding support for the full range of models available on OCI GenAI.
Related PR: #1537
Why Oracle GenAI Integration Matters
Oracle GenAI provides access to a unique combination of frontier models not available together on any other cloud platform:
| Vendor | Models | Unique Value |
|---|---|---|
| Meta | Llama 3.3, Llama 4 Maverick | Latest open-source LLMs |
| Google | Gemini 2.5 Flash/Pro | Multimodal, long context |
| OpenAI | GPT-4o, GPT-5, GPT-OSS | Frontier reasoning models |
| xAI | Grok 3, Grok 4.1 | Fast reasoning capabilities |
| Cohere | Command A, Command R+ | Enterprise RAG, multilingual |
Many enterprise customers use OCI as their primary cloud and need Portkey to route to these models reliably.
Critical Bug Fixed: Streaming Tool Calls
The current production gateway has a critical bug where streaming responses for tool/function calling don't return tool call data. This completely breaks agentic workflows.
Before (Current Production)
```python
# Model wants to call a tool, but data is missing
AgentResult(
    stop_reason='end_turn',   # ❌ Should be 'tool_use'
    message={'content': []},  # ❌ Tool call data missing
)
```

Test results with current gateway:
| Provider | Basic Chat | Tool Use (Streaming) |
|---|---|---|
| OpenAI | ✅ Pass | ✅ Pass |
| Anthropic | ✅ Pass | ✅ Pass |
| OCI (all models) | ✅ Pass | ❌ Fail |
After (With PR #1537)
All OCI models pass tool calling tests:
- ✅ Single tool calls
- ✅ Multiple sequential tools
- ✅ Parallel tool calls (3+ simultaneous)
- ✅ Streaming tool calls
Improvements in PR #1537
1. Streaming Tool Call State Management
- Proper buffering of tool call chunks across SSE events
- Index tracking for parallel tool calls
- Correct `contentBlockStart`/`contentBlockDelta`/`contentBlockStop` event emission
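
The buffering and index tracking listed above can be sketched roughly as follows. This is a minimal illustration, not the code in PR #1537: it assumes upstream deltas arrive in an OpenAI-style shape (`index`, `id`, `function.name`, `function.arguments`) and that the id and name arrive in the first fragment for each index. The class names `ToolCallBuffer` and `ToolCallStreamState` and the event payload fields are hypothetical; only the three event names come from this issue.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ToolCallBuffer:
    """Accumulates one tool call whose fragments arrive across several SSE events."""
    call_id: str = ""
    name: str = ""
    arguments: str = ""  # raw JSON text, concatenated fragment by fragment


@dataclass
class ToolCallStreamState:
    """Tracks in-flight tool calls keyed by stream index (hypothetical helper)."""
    buffers: dict = field(default_factory=dict)

    def on_delta(self, delta: dict) -> list:
        """Consume one tool-call delta and return the events to emit downstream."""
        events = []
        index = delta["index"]
        function = delta.get("function", {})
        buf = self.buffers.get(index)
        if buf is None:
            # First fragment for this index: open a new content block.
            buf = ToolCallBuffer(call_id=delta.get("id", ""), name=function.get("name", ""))
            self.buffers[index] = buf
            events.append({
                "type": "contentBlockStart",
                "index": index,
                "toolUse": {"toolUseId": buf.call_id, "name": buf.name},
            })
        fragment = function.get("arguments", "")
        if fragment:
            # Argument JSON may be split anywhere, so buffer it and forward the piece.
            buf.arguments += fragment
            events.append({
                "type": "contentBlockDelta",
                "index": index,
                "toolUse": {"input": fragment},
            })
        return events

    def finish(self) -> list:
        """Close every open block, in index order, once the stream ends."""
        return [{"type": "contentBlockStop", "index": i} for i in sorted(self.buffers)]

    def completed_calls(self) -> list:
        """Parse the buffered argument text into complete tool calls."""
        calls = []
        for i in sorted(self.buffers):
            buf = self.buffers[i]
            calls.append({
                "id": buf.call_id,
                "name": buf.name,
                "input": json.loads(buf.arguments or "{}"),
            })
        return calls
```

Keying the buffers by index is what lets fragments from parallel tool calls interleave without corrupting each other's argument JSON.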
2. Multi-Model Family Support
Each model family on OCI has different behaviors and parameters:
| Model Family | Special Handling |
|---|---|
| `meta.llama` | Standard parameters |
| `google.gemini` | Reasoning tokens support |
| `openai.gpt-4o` | Standard OpenAI format |
| `openai.gpt-5` | `maxCompletionTokens` instead of `max_tokens` |
| `openai.gpt-oss` | `reasoningContent` field handling |
| `xai.grok` | Fast inference optimizations |
| `xai.grok-*-reasoning` | Reasoning tokens support |
| `cohere.command` | Cohere-specific parameters |
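
As a rough sketch of how per-family handling like this might be dispatched, the snippet below matches a model id against the family prefixes in the table and picks the token-limit parameter name. The function names `model_family` and `token_limit_param` are hypothetical, the model ids in the usage note are illustrative, and the `maxTokens` fallback for families other than `openai.gpt-5` is an assumption rather than something stated in this issue.

```python
from fnmatch import fnmatchcase

# Ordered most-specific first so the reasoning pattern is tried before "xai.grok*".
FAMILY_PATTERNS = [
    ("xai.grok-*-reasoning", "xai.grok-reasoning"),
    ("xai.grok*", "xai.grok"),
    ("openai.gpt-5*", "openai.gpt-5"),
    ("openai.gpt-oss*", "openai.gpt-oss"),
    ("openai.gpt-4o*", "openai.gpt-4o"),
    ("google.gemini*", "google.gemini"),
    ("meta.llama*", "meta.llama"),
    ("cohere.command*", "cohere.command"),
]


def model_family(model_id: str) -> str:
    """Return the family key for a model id, or raise if none matches."""
    for pattern, family in FAMILY_PATTERNS:
        if not fnmatchcase(model_id, pattern):
            continue
        # A literal glob would also match ids ending in "-non-reasoning";
        # skip those so they fall through to the plain grok family.
        if family == "xai.grok-reasoning" and "non-reasoning" in model_id:
            continue
        return family
    raise ValueError(f"unrecognized OCI model id: {model_id}")


def token_limit_param(model_id: str, max_tokens: int) -> dict:
    """Map an incoming max_tokens value to the family's request field."""
    if model_family(model_id) == "openai.gpt-5":
        return {"maxCompletionTokens": max_tokens}
    # Assumption: the remaining families accept a camelCase maxTokens field.
    return {"maxTokens": max_tokens}
```

For example, `token_limit_param("openai.gpt-5", 256)` yields `{"maxCompletionTokens": 256}`, while an id starting with `meta.llama` yields `{"maxTokens": 256}` under the stated assumption.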
3. Embeddings Support
- Full Cohere embedding model support
- All input types (search_document, search_query, classification, clustering)
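
To make the input types concrete, here is a small sketch of validating one and building an embedding request body. The helper `build_embed_request` is hypothetical, and the body field names (`inputs`, `inputType`, `servingMode`, `compartmentId`) and the upper-case enum spelling are assumptions about the OCI request shape, not taken from PR #1537.

```python
# The four input types named in this issue.
VALID_INPUT_TYPES = ("search_document", "search_query", "classification", "clustering")


def build_embed_request(texts, input_type, model_id, compartment_id):
    """Build an embedding request body (field names are assumed, see lead-in)."""
    if input_type not in VALID_INPUT_TYPES:
        raise ValueError(
            f"input_type must be one of {VALID_INPUT_TYPES}, got {input_type!r}"
        )
    if not texts:
        raise ValueError("at least one input text is required")
    return {
        "inputs": list(texts),
        # Assumption: the upstream API expects the upper-case enum spelling.
        "inputType": input_type.upper(),
        "servingMode": {"servingType": "ON_DEMAND", "modelId": model_id},
        "compartmentId": compartment_id,
    }
```

A call such as `build_embed_request(["hello"], "search_query", "cohere.embed-english-v3.0", "ocid1.compartment.example")` would produce a body with `inputType` set to `SEARCH_QUERY`; the model id and compartment OCID here are placeholders.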
4. OCI Signature Authentication
- Complete OCI API key authentication with request signing
- Proper header handling for tenancy, user, fingerprint, and private key
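
The request-signing step can be sketched as below, assuming the HTTP Signature scheme that OCI documents for API key authentication (an RSA-SHA256 signature over a newline-joined list of headers, with the key id formed as tenancy/user/fingerprint). The function names are hypothetical, the content type is assumed to be JSON, and the RSA signing itself is left out because it needs a cryptography library. This is an illustration of the idea, not the gateway's implementation.

```python
import base64
import hashlib
from email.utils import formatdate


def build_signing_inputs(method, host, path, body, tenancy, user, fingerprint, date=None):
    """Return (signing_string, key_id, header_names) for one request."""
    date = date or formatdate(usegmt=True)  # RFC 1123 date, e.g. "... GMT"
    headers = [
        ("(request-target)", f"{method.lower()} {path}"),
        ("host", host),
        ("date", date),
    ]
    if method.upper() in ("POST", "PUT"):
        # Requests with a body also sign a SHA-256 digest of that body.
        digest = base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")
        headers += [
            ("x-content-sha256", digest),
            ("content-type", "application/json"),
            ("content-length", str(len(body))),
        ]
    signing_string = "\n".join(f"{name}: {value}" for name, value in headers)
    key_id = f"{tenancy}/{user}/{fingerprint}"
    header_names = " ".join(name for name, _ in headers)
    return signing_string, key_id, header_names


def authorization_header(key_id, header_names, signature_b64):
    """Assemble the Authorization header from a base64 RSA-SHA256 signature."""
    return (
        f'Signature version="1",keyId="{key_id}",algorithm="rsa-sha256",'
        f'headers="{header_names}",signature="{signature_b64}"'
    )
```

The caller would sign `signing_string` with the private key and pass the base64-encoded result to `authorization_header`. The header names listed in `headers="..."` must appear in the same order as the lines of the signing string.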
Test Coverage
- 61 unit tests passing
- Integration tests verified with real API calls across all model families
- Parallel tool calling verified with 3 simultaneous tools on all models
Why Keep Oracle Integration Up to Date
- Enterprise Adoption: Many Fortune 500 companies use OCI exclusively
- Unique Model Access: Only place to get GPT-5, Grok 4, Gemini 2.5, and Llama 4 on the same platform
- Rapid Model Releases: Oracle adds new models frequently; the integration needs to stay current
- Agentic AI Growth: Tool calling is essential for modern AI applications
I'm a Senior Principal Engineer at Oracle working on GenAI integrations and am committed to maintaining this provider to ensure it stays current with our product roadmap.
Request
Please review and merge PR #1537 to unblock agentic workflows for OCI GenAI users.