|
1 | 1 | ## llm‑router |
2 | 2 |
|
3 | | -A lightweight, extensible gateway that exposes a clean **REST** API for interacting with |
4 | | -multiple Large Language Model (LLM) providers (OpenAI, Ollama, vLLM, etc.). |
5 | | -It centralises request validation, prompt management, model configuration and logging, |
6 | | -allowing your application to talk to any supported LLM through a single, consistent interface. |
7 | | - |
8 | | -This project provides a robust solution for managing and routing requests to various LLM backends. |
9 | | -It simplifies the integration of LLMs into your applications by offering a unified API |
10 | | -and advanced features like load balancing strategies. |
| 3 | +**llm‑router** – a lightweight, modular ecosystem for building and interacting with Large Language Model (LLM) services. |
| 4 | + |
| 5 | +- **llm_router_api** provides a unified REST proxy that can route requests to any supported LLM backend
| 6 | +  (OpenAI‑compatible, Ollama, vLLM, LM Studio, etc.), with built‑in load‑balancing, health checks, streaming responses
| 7 | + and optional Prometheus metrics. |
| 8 | +- **llm_router_lib** is a Python SDK that wraps the API with typed request/response models, automatic retries, token |
| 9 | + handling and a rich exception hierarchy, letting developers focus on application logic rather than raw HTTP calls. |
| 10 | +- **llm_router_web** offers ready‑to‑use Flask UIs – an anonymizer UI that masks sensitive data and a configuration |
| 11 | + manager for model/user settings – demonstrating how to consume the router from a browser. |
| 12 | +- **Plugins** (e.g., the **fast_masker** plugin) deliver a rule‑based text anonymisation engine with a comprehensive set |
| 13 | + of Polish‑specific masking rules (emails, IPs, URLs, phone numbers, PESEL, NIP, KRS, REGON, monetary amounts, dates, |
| 14 | +  etc.) and an extensible architecture for custom rules and validators (sketched below).
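
The sketch below illustrates the rule‑based idea behind such a plugin: ordered regular‑expression rules that replace matches with placeholder tags. The patterns and names here are hypothetical simplifications for illustration, not the plugin's actual rules or API.

```python
import re

# Hypothetical rules for illustration only; the real plugin ships a far
# richer set of Polish-specific patterns plus validators for each match.
RULES = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "<EMAIL>"),
    (re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"), "<IP>"),
    (re.compile(r"\b\d{11}\b"), "<PESEL>"),  # PESEL is an 11-digit national ID
]

def mask(text: str) -> str:
    # Apply each rule in order; later rules see the already-masked text.
    for pattern, placeholder in RULES:
        text = pattern.sub(placeholder, text)
    return text

print(mask("Kontakt: jan.kowalski@example.pl, PESEL 44051401359"))
# -> Kontakt: <EMAIL>, PESEL <PESEL>
```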
| 15 | + |
| 16 | +All components run on Python 3.10+ using `virtualenv` and require only the listed dependencies, making the suite easy to |
| 17 | +install, extend, and deploy in both development and production environments. |
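
For a first impression of the unified interface, a request to the proxy can be as simple as the sketch below. The `/chat` route and payload shape are assumptions for illustration (the host, port and `/api` prefix come from the configuration defaults documented further down); consult the endpoint documentation for the real routes and schema.

```python
import requests

# Assumed local deployment; "/chat" and the payload shape are illustrative,
# not the router's documented contract.
ROUTER_URL = "http://localhost:8080/api"

response = requests.post(
    f"{ROUTER_URL}/chat",
    json={
        "model": "my-model",  # a logical model name from the models config
        "messages": [{"role": "user", "content": "Hello!"}],
    },
    timeout=300,
)
response.raise_for_status()
print(response.json())
```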
11 | 18 |
|
12 | 19 | --- |
13 | 20 |
|
@@ -137,28 +144,30 @@ docker run \ |
137 | 144 |
|
138 | 145 | ### 3️⃣ Optional configuration (via environment) |
139 | 146 |
|
140 | | -| Variable | Description | Default | |
141 | | -|-----------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------| |
142 | | -| `LLM_ROUTER_PROMPTS_DIR` | Directory containing predefined system prompts. | `resources/prompts` | |
143 | | -| `LLM_ROUTER_MODELS_CONFIG` | Path to the models configuration JSON file. | `resources/configs/models-config.json` | |
144 | | -| `LLM_ROUTER_DEFAULT_EP_LANGUAGE` | Default language for endpoint prompts. | `pl` | |
145 | | -| `LLM_ROUTER_TIMEOUT` | Timeout (seconds) for llm-router API calls. | `0` | |
146 | | -| `LLM_ROUTER_EXTERNAL_TIMEOUT` | Timeout (seconds) for external model API calls. | `300` | |
147 | | -| `LLM_ROUTER_LOG_FILENAME` | Name of the log file. | `llm-router.log` | |
148 | | -| `LLM_ROUTER_LOG_LEVEL` | Logging level (e.g., INFO, DEBUG). | `INFO` | |
149 | | -| `LLM_ROUTER_EP_PREFIX` | Prefix for all API endpoints. | `/api` | |
150 | | -| `LLM_ROUTER_MINIMUM` | Run service in proxy‑only mode (boolean). | `False` | |
151 | | -| `LLM_ROUTER_IN_DEBUG` | Run server in debug mode (boolean). | `False` | |
152 | | -| `LLM_ROUTER_BALANCE_STRATEGY` | Strategy used to balance routing between LLM providers. Allowed values are `balanced`, `weighted`, `dynamic_weighted` and `first_available` as defined in `constants_base.py`. | `balanced` | |
153 | | -| `LLM_ROUTER_REDIS_HOST` | Redis host for load‑balancing when a multi‑provider model is available. | `<empty string>` | |
154 | | -| `LLM_ROUTER_REDIS_PORT` | Redis port for load‑balancing when a multi‑provider model is available. | `6379` | |
155 | | -| `LLM_ROUTER_SERVER_TYPE` | Server implementation to use (`flask`, `gunicorn`, `waitress`). | `flask` | |
156 | | -| `LLM_ROUTER_SERVER_PORT` | Port on which the server listens. | `8080` | |
157 | | -| `LLM_ROUTER_SERVER_HOST` | Host address for the server. | `0.0.0.0` | |
158 | | -| `LLM_ROUTER_SERVER_WORKERS_COUNT` | Number of workers (used in case when the selected server type supports multiworkers) | `2` | |
159 | | -| `LLM_ROUTER_SERVER_THREADS_COUNT` | Number of workers threads (used in case when the selected server type supports multithreading) | `8` | |
160 | | -| `LLM_ROUTER_SERVER_WORKER_CLASS` | If server accepts workers type, its able to set worker class by this environment. | `None` | |
161 | | -| `LLM_ROUTER_USE_PROMETHEUS` | Enable Prometheus metrics collection.** When set to `True`, the router registers a `/metrics` endpoint exposing Prometheus‑compatible metrics for monitoring. | `False` | |
| 147 | +| Variable | Description | Default | |
| 148 | +|---------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------| |
| 149 | +| `LLM_ROUTER_PROMPTS_DIR` | Directory containing predefined system prompts. | `resources/prompts` | |
| 150 | +| `LLM_ROUTER_MODELS_CONFIG` | Path to the models configuration JSON file. | `resources/configs/models-config.json` | |
| 151 | +| `LLM_ROUTER_DEFAULT_EP_LANGUAGE` | Default language for endpoint prompts. | `pl` | |
| 152 | +| `LLM_ROUTER_TIMEOUT` | Timeout (seconds) for llm-router API calls. | `0` | |
| 153 | +| `LLM_ROUTER_EXTERNAL_TIMEOUT` | Timeout (seconds) for external model API calls. | `300` | |
| 154 | +| `LLM_ROUTER_LOG_FILENAME` | Name of the log file. | `llm-router.log` | |
| 155 | +| `LLM_ROUTER_LOG_LEVEL` | Logging level (e.g., INFO, DEBUG). | `INFO` | |
| 156 | +| `LLM_ROUTER_EP_PREFIX` | Prefix for all API endpoints. | `/api` | |
| 157 | +| `LLM_ROUTER_MINIMUM` | Run service in proxy‑only mode (boolean). | `False` | |
| 158 | +| `LLM_ROUTER_IN_DEBUG` | Run server in debug mode (boolean). | `False` | |
| 159 | +| `LLM_ROUTER_BALANCE_STRATEGY` | Strategy used to balance routing between LLM providers. Allowed values are `balanced`, `weighted`, `dynamic_weighted` and `first_available` as defined in `constants_base.py`. | `balanced` | |
| 160 | +| `LLM_ROUTER_REDIS_HOST` | Redis host for load‑balancing when a multi‑provider model is available. | `<empty string>` | |
| 161 | +| `LLM_ROUTER_REDIS_PORT` | Redis port for load‑balancing when a multi‑provider model is available. | `6379` | |
| 162 | +| `LLM_ROUTER_SERVER_TYPE` | Server implementation to use (`flask`, `gunicorn`, `waitress`). | `flask` | |
| 163 | +| `LLM_ROUTER_SERVER_PORT` | Port on which the server listens. | `8080` | |
| 164 | +| `LLM_ROUTER_SERVER_HOST` | Host address for the server. | `0.0.0.0` | |
| 165 | +| `LLM_ROUTER_SERVER_WORKERS_COUNT`           | Number of worker processes (used when the selected server type supports multiple workers).                                                                                      | `2`                                    |
| 166 | +| `LLM_ROUTER_SERVER_THREADS_COUNT`           | Number of worker threads (used when the selected server type supports multithreading).                                                                                          | `8`                                    |
| 167 | +| `LLM_ROUTER_SERVER_WORKER_CLASS`            | Worker class to use, when the selected server type supports setting one.                                                                                                        | `None`                                 |
| 168 | +| `LLM_ROUTER_USE_PROMETHEUS`                 | Enable Prometheus metrics collection. When set to `True`, the router registers a `/metrics` endpoint exposing Prometheus‑compatible metrics for monitoring.                     | `False`                                |
| 169 | +| `LLM_ROUTER_FORCE_ANONYMISATION`            | Enable whole‑payload anonymisation. Every key and value is automatically anonymised before being sent to the model provider.                                                    | `False`                                |
| 170 | +| `LLM_ROUTER_ENABLE_GENAI_ANONYMIZE_TEXT_EP` | Enable the built‑in endpoint `/api/anonymize_text_genai`, which uses GenAI to anonymise text.                                                                                    | `False`                                |
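
As a sketch of how these settings are typically consumed (not the router's actual configuration loader), each variable falls back to its documented default when unset:

```python
import os

def env_bool(name: str, default: str = "False") -> bool:
    # Interpret common truthy spellings; anything else counts as False.
    return os.environ.get(name, default).strip().lower() in {"1", "true", "yes"}

SERVER_HOST = os.environ.get("LLM_ROUTER_SERVER_HOST", "0.0.0.0")
SERVER_PORT = int(os.environ.get("LLM_ROUTER_SERVER_PORT", "8080"))
EXTERNAL_TIMEOUT = int(os.environ.get("LLM_ROUTER_EXTERNAL_TIMEOUT", "300"))
BALANCE_STRATEGY = os.environ.get("LLM_ROUTER_BALANCE_STRATEGY", "balanced")
USE_PROMETHEUS = env_bool("LLM_ROUTER_USE_PROMETHEUS")

print(f"listening on {SERVER_HOST}:{SERVER_PORT}, strategy={BALANCE_STRATEGY}")
```

Overriding a value is then just a matter of exporting it before launch, e.g. `LLM_ROUTER_SERVER_PORT=9090`.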
162 | 171 |
|
163 | 172 | ### 4️⃣ Run the REST API |
164 | 173 |
|
|