## Summary

Add support for [Ollama](https://ollama.com/) as a local LLM provider in the Copilot SDK, enabling users to run AI-powered features with locally hosted open-source models instead of relying on cloud-based API services.

## Problem / Use Case

Currently, the Copilot SDK relies on cloud-based LLM providers, which may not be suitable for all use cases:

- **Privacy & Data Sensitivity**: Some users and organizations need to keep their data on-premises and cannot send it to external APIs.
- **Cost**: Cloud API usage can become expensive at scale. Running models locally with Ollama eliminates per-token costs.
- **Offline Access**: Developers working in air-gapped or limited-connectivity environments need a local inference option.
- **Flexibility**: Ollama supports a wide range of open-source models (Llama 3, Mistral, Phi, Gemma, etc.), giving users the freedom to choose the model that best fits their needs.

## Proposed Solution

Integrate Ollama as a supported LLM provider in the SDK. This could include:

1. A new provider configuration option for Ollama
2. Support for specifying the Ollama server URL (default: `http://localhost:11434`)
3. Model selection from locally available Ollama models
4. Streaming response support via the Ollama API (a rough sketch is included at the end of this issue)

## Alternatives Considered

- **LM Studio**: Another local inference tool, but Ollama has broader community adoption and a simpler API.
- **llama.cpp direct integration**: Lower-level and harder to maintain; Ollama provides a stable HTTP API layer on top.
- **Self-hosted cloud APIs**: Still require infrastructure management and don't solve the offline use case.

## Additional Context

Ollama provides an OpenAI-compatible API endpoint, which may simplify the integration effort if the SDK already supports OpenAI-style APIs. The `/api/chat` and `/api/generate` endpoints follow a similar request/response pattern.
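
## Example: streaming from Ollama's `/api/chat`

To make items 2–4 of the proposed solution concrete, here is a minimal sketch of what a provider adapter could look like. The `OllamaProviderConfig` shape and `streamOllamaChat` name are hypothetical (nothing here reflects an existing Copilot SDK API); the request/response format is Ollama's documented `/api/chat` endpoint, which streams newline-delimited JSON chunks.

```ts
// Hypothetical configuration shape for the proposed provider; names are
// illustrative only, not part of any existing SDK API.
interface OllamaProviderConfig {
  baseUrl?: string; // Ollama server URL, defaults to the standard local port
  model: string;    // any locally pulled model, e.g. "llama3"
}

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Streams a chat completion from Ollama's /api/chat endpoint, yielding content
// chunks as they arrive. Ollama streams newline-delimited JSON objects, each
// carrying a partial `message.content`, and finishes with `done: true`.
async function* streamOllamaChat(
  config: OllamaProviderConfig,
  messages: ChatMessage[],
): AsyncGenerator<string> {
  const baseUrl = config.baseUrl ?? "http://localhost:11434";
  const response = await fetch(`${baseUrl}/api/chat`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: config.model, messages, stream: true }),
  });
  if (!response.ok || !response.body) {
    throw new Error(`Ollama request failed: ${response.status}`);
  }

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    // Each complete line in the stream is one JSON object.
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? "";
    for (const line of lines) {
      if (!line.trim()) continue;
      const chunk = JSON.parse(line);
      if (chunk.message?.content) yield chunk.message.content;
      if (chunk.done) return;
    }
  }
}

// Usage sketch: print tokens as they arrive from a locally pulled model.
for await (const token of streamOllamaChat({ model: "llama3" }, [
  { role: "user", content: "Hello!" },
])) {
  process.stdout.write(token);
}
```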
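
## Example: reusing an OpenAI-style client

If the SDK already speaks OpenAI-style APIs, the compatibility layer mentioned above may reduce the integration to a base-URL change. The sketch below uses the official `openai` npm client purely as an illustration of that path; whether the SDK can swap its backend this way is an assumption, not a statement about its current internals. Ollama serves the compatible endpoints under `/v1` and does not validate the API key.

```ts
import OpenAI from "openai";

// Point an OpenAI-style client at the local Ollama server. The API key is
// required by the client library but ignored by Ollama.
const client = new OpenAI({
  baseURL: "http://localhost:11434/v1",
  apiKey: "ollama",
});

const stream = await client.chat.completions.create({
  model: "llama3", // any model already pulled locally
  messages: [{ role: "user", content: "Summarize what Ollama does." }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
```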