LLMSharp (In Progress)


Unified .NET client for interacting with popular local LLM providers like KoboldCpp, Ollama, LM Studio, and Oobabooga.

  • Single API: Use the same code regardless of the underlying LLM provider
  • Provider Flexibility: Easily switch between KoboldCpp, Ollama, LM Studio, or Text Generation WebUI (see the sketch just below)
  • Modern .NET: Async/await, streaming support, and comprehensive logging

📦 View Package on NuGet
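
Because every provider sits behind the same client, switching backends is a configuration change only. A minimal sketch, assuming KoboldCpp and Ollama are running on their usual default ports (full configuration options appear later in this README):

using LLMSharp.Client;
using LLMSharp.Models.Configuration;

// Point the client at KoboldCpp...
var koboldOptions = new LocalAIOptions
{
    BaseUrl = "http://localhost:5000",
    ProviderOptions = new KoboldCppNativeOptions()
};

// ...or at Ollama; the completion code below is identical either way.
var ollamaOptions = new LocalAIOptions
{
    BaseUrl = "http://localhost:11434",
    ProviderOptions = new OllamaOptions()
};

using var client = new LocalAIClient(ollamaOptions);
string reply = await client.CompleteAsync("Hello!");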

Feature Comparison

Feature                  LLMSharp   OpenAI.NET   LLamaSharp   OllamaSharp
Local LLM Support        ✅         ❌           ✅           ✅
Multiple Providers       ✅         ❌           ❌           ❌
KoboldCpp Support        ✅         ❌           ❌           ❌
Ollama Support           ✅         ❌           ❌           ✅
LM Studio Support        ✅         ❌           ❌           ❌
Text Gen WebUI Support   ✅         ❌           ❌           ❌
Streaming                ✅         ✅           ✅           ✅
OpenAI Compatible        ✅         ✅           ❌           ✅
Progress Tracking        ✅         ❌           ❌           ❌
Retry Policies           ✅         ❌           ❌           ❌
Circuit Breaker          ✅         ❌           ❌           ❌
.NET Standard 2.0        ❌         ✅           ✅           ✅
.NET 6.0+                ✅         ✅           ✅           ✅
The diagram below shows each provider client deriving from the shared LlmClient base and the service interfaces each one implements.

classDiagram
    %% Core Interfaces
    class ICompletionService {
        <<interface>>
        +CompleteAsync(CompletionRequest)*
        +StreamCompletionAsync(CompletionRequest)*
    }
    class IChatService {
        <<interface>>
        +ChatCompleteAsync(ChatRequest)*
        +StreamChatAsync(ChatRequest)*
    }
    class IModelMetadata {
        <<interface>>
        +GetAvailableModelsAsync()*
        +GetModelInfoAsync(string)*
    }

    %% Base Abstract Client
    class LlmClient {
        <<abstract>>
        #HttpClient _httpClient
        #ILogger _logger
        #LlmOptions _options
        #RetryPolicy _retryPolicy
        #CircuitBreaker _circuitBreaker
        #ExecuteWithResilienceAsync[T]()
        +IsAvailableAsync()
    }

    %% Providers 
    class OobaboogaSharp {
        +CompleteAsync()
        +StreamCompletionAsync()
        +ChatCompleteAsync()
        +GetAvailableModels()
    }
    class LmStudioSharp {
        +CompleteAsync()
        +StreamCompletionAsync()
        +ChatCompleteAsync()
    }
    class KoboldSharp {
        +CompleteAsync()
        +StreamCompletionAsync()
    }
    class Gpt4AllSharp {
        +ChatCompleteAsync()
        +StreamChatAsync()
        +GetAvailableModels()
    }

    %% Base Inheritance
    LlmClient <|-- OobaboogaSharp
    LlmClient <|-- LmStudioSharp
    LlmClient <|-- KoboldSharp
    LlmClient <|-- Gpt4AllSharp

    %% Interface Implementation
    ICompletionService <|.. OobaboogaSharp
    IChatService <|.. OobaboogaSharp
    IModelMetadata <|.. OobaboogaSharp

    ICompletionService <|.. LmStudioSharp
    IChatService <|.. LmStudioSharp

    ICompletionService <|.. KoboldSharp

    IChatService <|.. Gpt4AllSharp
    IModelMetadata <|.. Gpt4AllSharp
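The RetryPolicy and CircuitBreaker members in the diagram suggest how ExecuteWithResilienceAsync wraps provider calls. The following is a minimal illustrative sketch of that pattern, not LLMSharp's actual implementation; every name and threshold here is an assumption:

using System;
using System.Threading.Tasks;

// Illustrative only: a simplified retry + circuit-breaker wrapper in the
// spirit of LlmClient.ExecuteWithResilienceAsync. The real logic is internal
// to LLMSharp; names and thresholds below are assumptions.
public sealed class ResilienceSketch
{
    private int _consecutiveFailures;
    private DateTimeOffset _openUntil = DateTimeOffset.MinValue;

    public async Task<T> ExecuteAsync<T>(
        Func<Task<T>> action, int maxAttempts = 3, int breakAfter = 5)
    {
        // Circuit breaker: fail fast while the circuit is open.
        if (DateTimeOffset.UtcNow < _openUntil)
            throw new InvalidOperationException("Circuit is open; provider presumed down.");

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                T result = await action();
                _consecutiveFailures = 0; // a success closes the circuit
                return result;
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                // Retry policy: exponential backoff between attempts.
                await Task.Delay(TimeSpan.FromSeconds(Math.Pow(2, attempt)));
            }
            catch (Exception)
            {
                // Too many failures in a row: open the circuit briefly.
                if (++_consecutiveFailures >= breakAfter)
                    _openUntil = DateTimeOffset.UtcNow.AddSeconds(30);
                throw;
            }
        }
    }
}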

Supported Providers

  • KoboldCpp: Both native and OpenAI-compatible modes
  • Ollama: Run Llama 2, Code Llama, and other models locally (using OllamaSharp)
  • LM Studio: Local deployment of various open-source models
  • Text Generation WebUI: Popular web interface for running local models

Installation

Install LLMSharp via NuGet:

dotnet add package LLMSharp

Quick Start

using LLMSharp.Client;
using LLMSharp.Models.Configuration;

// Create client with KoboldCpp provider
var options = new LocalAIOptions
{
    BaseUrl = "http://localhost:5000",
    ProviderOptions = new KoboldCppNativeOptions
    {
        ContextSize = 2048,
        UseGpu = true,
        RepetitionPenalty = 1.1f
    }
};

using var client = new LocalAIClient(options);

// Generate text completion
string response = await client.CompleteAsync("Write a short story about a robot:");

// Stream completion tokens
await foreach (var token in client.StreamCompletionAsync("Once upon a time..."))
{
    Console.Write(token);
}

// List available models
var models = await client.GetAvailableModelsAsync();
foreach (var model in models)
{
    Console.WriteLine($"Model: {model.Name} (Provider: {model.Provider})");
    Console.WriteLine($"Context Length: {model.Capabilities.MaxContextLength}");
}

Provider Configuration

KoboldCpp (Native)

var options = new LocalAIOptions
{
    BaseUrl = "http://localhost:5000",
    ProviderOptions = new KoboldCppNativeOptions
    {
        ContextSize = 2048,
        UseGpu = true,
        RepetitionPenalty = 1.1f,
        RepetitionPenaltyRange = 320,
        TrimStop = true,
        Mirostat = new MirostatSettings
        {
            Mode = 2,
            Tau = 5.0f,
            Eta = 0.1f
        }
    }
};

KoboldCpp (OpenAI-compatible)

var options = new LocalAIOptions
{
    BaseUrl = "http://localhost:5000",
    ProviderOptions = new KoboldCppOpenAiOptions
    {
        ContextSize = 2048,
        UseGpu = true,
        ModelName = "koboldcpp",
        UseChatCompletions = true
    }
};

Ollama

var options = new LocalAIOptions
{
    BaseUrl = "http://localhost:11434",
    ProviderOptions = new OllamaOptions
    {
        ConcurrentRequests = 1
    }
};

LM Studio

var options = new LocalAIOptions
{
    BaseUrl = "http://localhost:1234",
    ProviderOptions = new LMStudioOptions
    {
        UseOpenAIEndpoint = true
    }
};

Text Generation WebUI

var options = new LocalAIOptions
{
    BaseUrl = "http://localhost:7860",
    ProviderOptions = new TextGenWebOptions
    {
        UseOpenAIEndpoint = true
    }
};

Completion Options

var options = new CompletionOptions
{
    ModelName = "wizardLM",         // Optional model name
    MaxTokens = 200,                // Max tokens to generate
    Temperature = 0.7f,             // Randomness (0.0-1.0)
    TopP = 0.9f,                    // Nucleus sampling threshold
    StopSequences = new[] { "\n" }  // Sequences that stop generation
};

string response = await client.CompleteAsync("Your prompt here", options);

Progress Tracking

client.OnProgress += (progress) =>
{
    switch (progress.State)
    {
        case LocalAIProgressState.Starting:
            Console.WriteLine("Starting completion...");
            break;
        case LocalAIProgressState.Processing:
            Console.WriteLine($"Processing: {progress.Message}");
            break;
        case LocalAIProgressState.Streaming:
            Console.WriteLine("Receiving tokens...");
            break;
        case LocalAIProgressState.Complete:
            Console.WriteLine("Completion finished!");
            break;
        case LocalAIProgressState.Failed:
            Console.WriteLine($"Error: {progress.Message}");
            break;
    }
};
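
Progress events pair naturally with streaming. A short usage example, using only the members shown above (the prompt and output streams are arbitrary):

// Log state transitions to stderr so they don't interleave with tokens.
client.OnProgress += progress =>
    Console.Error.WriteLine($"[{progress.State}] {progress.Message}");

await foreach (var token in client.StreamCompletionAsync("Tell me a joke:"))
{
    Console.Write(token);
}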

Error Handling

try
{
    var response = await client.CompleteAsync("Test prompt");
}
catch (LocalAIException ex)
{
    Console.WriteLine($"LocalAI API error: {ex.Message}");
    if (ex.StatusCode.HasValue)
    {
        Console.WriteLine($"Status code: {ex.StatusCode}");
    }
    if (ex.Provider != null)
    {
        Console.WriteLine($"Provider: {ex.Provider}");
    }
}
catch (Exception ex)
{
    Console.WriteLine($"General error: {ex.Message}");
}

Advanced Configuration

var options = new LocalAIOptions
{
    BaseUrl = "http://localhost:5000",
    ApiKey = "optional_api_key",
    Timeout = TimeSpan.FromMinutes(2),
    MaxRetryAttempts = 3,
    RetryDelay = TimeSpan.FromSeconds(2),
    Logger = loggerInstance,
    JsonSettings = new JsonSerializerSettings(),
    ProviderOptions = new KoboldCppNativeOptions()
};
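
The Logger property takes a standard Microsoft.Extensions.Logging.ILogger. One way to construct loggerInstance is with a console logger factory (this assumes the Microsoft.Extensions.Logging.Console package is installed):

using Microsoft.Extensions.Logging;

// Build a console logger; SetMinimumLevel controls verbosity.
using var loggerFactory = LoggerFactory.Create(builder =>
    builder.AddConsole().SetMinimumLevel(LogLevel.Information));

ILogger loggerInstance = loggerFactory.CreateLogger<LocalAIClient>();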

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Please see CONTRIBUTING.md for details on:

  • How to publish to NuGet
  • Development guidelines
  • Code style
  • Testing requirements
  • Pull request process

Support

For issues and feature requests, please use the GitHub issues page.