Commit 256a316

Author: Paweł Kędzia
Commit message: Merge branch 'features/anon'
2 parents f05261c + e55f778, commit 256a316


86 files changed: +3479 / -251 lines

.version

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-0.2.3
+0.3.0

CHANGELOG.md

Lines changed: 3 additions & 1 deletion
@@ -11,4 +11,6 @@
 | 0.2.0 | Add balancing strategies: `balanced`, `weighted`, `dynamic_weighted` and `first_available`, which work for streaming and non‑streaming requests. Included Prometheus metrics logging via `/metrics` endpoint. First stage of the `llm_router_lib` library, to simplify usage of `llm-router-api`. |
 | 0.2.1 | Fix stream: OpenAI->Ollama, Ollama->OpenAI. Add Redis caching of model‑provider availability (when using the `first_available` strategy). Add `llm_router_web` module with a simple Flask‑based frontend to manage llm-router config files. |
 | 0.2.2 | Update dockerfile and requirements. Fix routing with vLLM. |
-| 0.2.3 | New web configurator: handling projects and configs for each user separately. First Available strategy is more powerful; many efficiency improvements. |
+| 0.2.3 | New web configurator: handling projects and configs for each user separately. First Available strategy is more powerful; many efficiency improvements. |
+| 0.2.4 | Anonymizer module: integration of anonymisation with any endpoint (using dynamic payload analysis and full payload anonymisation), plus a dedicated `/api/anonymize_text` endpoint for memory‑only anonymisation. The whole router may be run in `FORCE_ANONYMISATION` mode. |
+| 0.3.0 | Anonymization available with three strategies: `fast_masker`, `genai`, `prov_masker`. |

README.md

Lines changed: 39 additions & 30 deletions
@@ -1,13 +1,20 @@
 ## llm‑router
 
-A lightweight, extensible gateway that exposes a clean **REST** API for interacting with
-multiple Large Language Model (LLM) providers (OpenAI, Ollama, vLLM, etc.).
-It centralises request validation, prompt management, model configuration and logging,
-allowing your application to talk to any supported LLM through a single, consistent interface.
-
-This project provides a robust solution for managing and routing requests to various LLM backends.
-It simplifies the integration of LLMs into your applications by offering a unified API
-and advanced features like load balancing strategies.
+**llm‑router** – a lightweight, modular ecosystem for building and interacting with Large Language Model (LLM) services.
+
+- **llm_router_api** provides a unified REST proxy that can route requests to any supported LLM backend
+  (OpenAI‑compatible, Ollama, vLLM, LM Studio, etc.), with built‑in load balancing, health checks, streaming responses
+  and optional Prometheus metrics.
+- **llm_router_lib** is a Python SDK that wraps the API with typed request/response models, automatic retries, token
+  handling and a rich exception hierarchy, letting developers focus on application logic rather than raw HTTP calls.
+- **llm_router_web** offers ready‑to‑use Flask UIs – an anonymizer UI that masks sensitive data and a configuration
+  manager for model/user settings – demonstrating how to consume the router from a browser.
+- **Plugins** (e.g., the **fast_masker** plugin) deliver a rule‑based text anonymisation engine with a comprehensive set
+  of Polish‑specific masking rules (emails, IPs, URLs, phone numbers, PESEL, NIP, KRS, REGON, monetary amounts, dates,
+  etc.) and an extensible architecture for custom rules and validators.
+
+All components run on Python 3.10+ using `virtualenv` and require only the listed dependencies, making the suite easy to
+install, extend, and deploy in both development and production environments.
 
 ---
 
@@ -137,28 +144,30 @@ docker run \
 
 ### 3️⃣ Optional configuration (via environment)
 
-| Variable | Description | Default |
-|----------|-------------|---------|
-| `LLM_ROUTER_PROMPTS_DIR` | Directory containing predefined system prompts. | `resources/prompts` |
-| `LLM_ROUTER_MODELS_CONFIG` | Path to the models configuration JSON file. | `resources/configs/models-config.json` |
-| `LLM_ROUTER_DEFAULT_EP_LANGUAGE` | Default language for endpoint prompts. | `pl` |
-| `LLM_ROUTER_TIMEOUT` | Timeout (seconds) for llm-router API calls. | `0` |
-| `LLM_ROUTER_EXTERNAL_TIMEOUT` | Timeout (seconds) for external model API calls. | `300` |
-| `LLM_ROUTER_LOG_FILENAME` | Name of the log file. | `llm-router.log` |
-| `LLM_ROUTER_LOG_LEVEL` | Logging level (e.g., INFO, DEBUG). | `INFO` |
-| `LLM_ROUTER_EP_PREFIX` | Prefix for all API endpoints. | `/api` |
-| `LLM_ROUTER_MINIMUM` | Run service in proxy‑only mode (boolean). | `False` |
-| `LLM_ROUTER_IN_DEBUG` | Run server in debug mode (boolean). | `False` |
-| `LLM_ROUTER_BALANCE_STRATEGY` | Strategy used to balance routing between LLM providers. Allowed values are `balanced`, `weighted`, `dynamic_weighted` and `first_available`, as defined in `constants_base.py`. | `balanced` |
-| `LLM_ROUTER_REDIS_HOST` | Redis host for load‑balancing when a multi‑provider model is available. | `<empty string>` |
-| `LLM_ROUTER_REDIS_PORT` | Redis port for load‑balancing when a multi‑provider model is available. | `6379` |
-| `LLM_ROUTER_SERVER_TYPE` | Server implementation to use (`flask`, `gunicorn`, `waitress`). | `flask` |
-| `LLM_ROUTER_SERVER_PORT` | Port on which the server listens. | `8080` |
-| `LLM_ROUTER_SERVER_HOST` | Host address for the server. | `0.0.0.0` |
-| `LLM_ROUTER_SERVER_WORKERS_COUNT` | Number of workers (used when the selected server type supports multiple workers). | `2` |
-| `LLM_ROUTER_SERVER_THREADS_COUNT` | Number of worker threads (used when the selected server type supports multithreading). | `8` |
-| `LLM_ROUTER_SERVER_WORKER_CLASS` | If the selected server type supports worker classes, the worker class can be set via this variable. | `None` |
-| `LLM_ROUTER_USE_PROMETHEUS` | Enable Prometheus metrics collection. When set to `True`, the router registers a `/metrics` endpoint exposing Prometheus‑compatible metrics for monitoring. | `False` |
+| Variable | Description | Default |
+|----------|-------------|---------|
+| `LLM_ROUTER_PROMPTS_DIR` | Directory containing predefined system prompts. | `resources/prompts` |
+| `LLM_ROUTER_MODELS_CONFIG` | Path to the models configuration JSON file. | `resources/configs/models-config.json` |
+| `LLM_ROUTER_DEFAULT_EP_LANGUAGE` | Default language for endpoint prompts. | `pl` |
+| `LLM_ROUTER_TIMEOUT` | Timeout (seconds) for llm-router API calls. | `0` |
+| `LLM_ROUTER_EXTERNAL_TIMEOUT` | Timeout (seconds) for external model API calls. | `300` |
+| `LLM_ROUTER_LOG_FILENAME` | Name of the log file. | `llm-router.log` |
+| `LLM_ROUTER_LOG_LEVEL` | Logging level (e.g., INFO, DEBUG). | `INFO` |
+| `LLM_ROUTER_EP_PREFIX` | Prefix for all API endpoints. | `/api` |
+| `LLM_ROUTER_MINIMUM` | Run service in proxy‑only mode (boolean). | `False` |
+| `LLM_ROUTER_IN_DEBUG` | Run server in debug mode (boolean). | `False` |
+| `LLM_ROUTER_BALANCE_STRATEGY` | Strategy used to balance routing between LLM providers. Allowed values are `balanced`, `weighted`, `dynamic_weighted` and `first_available`, as defined in `constants_base.py`. | `balanced` |
+| `LLM_ROUTER_REDIS_HOST` | Redis host for load‑balancing when a multi‑provider model is available. | `<empty string>` |
+| `LLM_ROUTER_REDIS_PORT` | Redis port for load‑balancing when a multi‑provider model is available. | `6379` |
+| `LLM_ROUTER_SERVER_TYPE` | Server implementation to use (`flask`, `gunicorn`, `waitress`). | `flask` |
+| `LLM_ROUTER_SERVER_PORT` | Port on which the server listens. | `8080` |
+| `LLM_ROUTER_SERVER_HOST` | Host address for the server. | `0.0.0.0` |
+| `LLM_ROUTER_SERVER_WORKERS_COUNT` | Number of workers (used when the selected server type supports multiple workers). | `2` |
+| `LLM_ROUTER_SERVER_THREADS_COUNT` | Number of worker threads (used when the selected server type supports multithreading). | `8` |
+| `LLM_ROUTER_SERVER_WORKER_CLASS` | If the selected server type supports worker classes, the worker class can be set via this variable. | `None` |
+| `LLM_ROUTER_USE_PROMETHEUS` | Enable Prometheus metrics collection. When set to `True`, the router registers a `/metrics` endpoint exposing Prometheus‑compatible metrics for monitoring. | `False` |
+| `LLM_ROUTER_FORCE_ANONYMISATION` | Enable whole‑payload anonymisation. Each key and value is auto‑anonymised before being sent to the model provider. | `False` |
+| `LLM_ROUTER_ENABLE_GENAI_ANONYMIZE_TEXT_EP` | Enable the built‑in endpoint `/api/anonymize_text_genai`, which uses GenAI to anonymise text. | `False` |
 
 ### 4️⃣ Run the REST API
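The two anonymisation variables added to the table can be combined with the existing server settings. As a minimal sketch (assuming only that the router reads these variables from the environment at startup, as the table documents), the environment could be prepared like this before launching:

```python
import os

# Force full-payload anonymisation: every key and value is anonymised
# before being sent to the model provider (per the table above).
os.environ["LLM_ROUTER_FORCE_ANONYMISATION"] = "True"

# Also expose the built-in GenAI-based endpoint /api/anonymize_text_genai.
os.environ["LLM_ROUTER_ENABLE_GENAI_ANONYMIZE_TEXT_EP"] = "True"

# Leave the documented defaults in place unless already overridden.
os.environ.setdefault("LLM_ROUTER_SERVER_PORT", "8080")
os.environ.setdefault("LLM_ROUTER_BALANCE_STRATEGY", "balanced")
```

The same values can of course be set with `export` in the shell or `-e` flags on `docker run`, as shown earlier.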

llm_router_api/README.md

Lines changed: 1 addition & 0 deletions
@@ -69,6 +69,7 @@ Configuration is driven primarily by environment variables and a JSON model‑co
 | `LLM_ROUTER_SERVER_WORKERS_COUNT` | Number of workers (Gunicorn/Waitress). | `2` |
 | `LLM_ROUTER_SERVER_THREADS_COUNT` | Number of threads per worker. | `8` |
 | `LLM_ROUTER_SERVER_WORKER_CLASS` | Gunicorn worker class (e.g., `gevent`). | *empty* |
+| `LLM_ROUTER_FORCE_ANONYMISATION` | Each request payload will be fully anonymised. | `False` |
 
 ### Model Configuration