Privacy proxy for LLM traffic. Detect, mask, and unmask PII in real-time.
Rust-native · <5ms latency · 30+ entity types · OpenAI-compatible · Local-first
CloakPipe is a high-performance privacy proxy that sits between your application and any LLM API. It detects PII (personally identifiable information) in your prompts, replaces it with safe tokens, forwards the sanitized request to the LLM, and restores the original values in the response.
The LLM never sees your real data. Your users see natural responses.
Your App ──▶ CloakPipe ──▶ OpenAI / Anthropic / Any LLM
                 │
  Detect → Mask → Proxy → Unmask
                 │
          Encrypted Vault
           (AES-256-GCM)
# Start CloakPipe
docker run -p 3100:3100 ghcr.io/cloakpipe/cloakpipe:latest
# Point your OpenAI SDK at CloakPipe
export OPENAI_BASE_URL=http://localhost:3100/v1
# Done. All LLM calls now go through CloakPipe.

# Install via cargo
cargo install cloakpipe
# Or download the latest release
curl -fsSL https://cloakpipe.co/install.sh | sh
# Start the proxy
cloakpipe serve --port 3100

curl http://localhost:3100/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-d '{
"model": "gpt-4",
"messages": [
{"role": "user", "content": "Summarize the case for Rajesh Singh, Aadhaar 2345 6789 0123, treated at Apollo Hospital Mumbai."}
]
}'
# CloakPipe logs:
# ✓ Detected 3 entities: PERSON, AADHAAR, ORGANIZATION
# ✓ Masked: Rajesh Singh → PERSON_042, 2345 6789 0123 → AADHAAR_017, Apollo Hospital Mumbai → ORG_003
# ✓ Proxied to api.openai.com (sanitized)
# ✓ Unmasked response: PERSON_042 → Rajesh Singh (restored)

What you send:
Summarize the medical history of Dr. Rajesh Singh (Aadhaar: 2345 6789 0123), treated at Apollo Hospital Mumbai for cardiac issues since March 2024.
What the LLM sees:
Summarize the medical history of PERSON_042 (Aadhaar: AADHAAR_017), treated at ORG_003 for cardiac issues since DATE_012.
What your app gets back:
Dr. Rajesh Singh has been under cardiac care at Apollo Hospital Mumbai since March 2024. The treatment history includes...
The LLM generates a coherent response using the tokens. CloakPipe restores the original values before returning to your app. The model never saw the real data.
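The round trip described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two regex patterns, the `mask`/`unmask` helpers, and the in-memory dict standing in for the vault are all hypothetical and are not CloakPipe's actual rules or API.

```python
import re

# Hypothetical patterns for illustration; not CloakPipe's actual rules.
PATTERNS = {
    "AADHAAR": re.compile(r"\b\d{4} \d{4} \d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}


def mask(text, vault):
    """Replace each match with a TYPE_NNN token and record token -> value."""
    for kind, pattern in PATTERNS.items():
        def substitute(match, kind=kind):
            value = match.group(0)
            # Reuse the existing token if this value was already seen.
            for token, original in vault.items():
                if original == value:
                    return token
            n = sum(1 for t in vault if t.startswith(kind + "_")) + 1
            token = f"{kind}_{n:03d}"
            vault[token] = value
            return token
        text = pattern.sub(substitute, text)
    return text


def unmask(text, vault):
    """Restore the original value for every token found in the text."""
    for token, original in vault.items():
        text = text.replace(token, original)
    return text


vault = {}  # plain dict standing in for the encrypted vault
prompt = "Patient Aadhaar 2345 6789 0123, contact priya@example.com"
masked = mask(prompt, vault)
# Stand-in for an LLM reply that echoes one of the tokens.
reply = f"Noted. I will file this under {next(iter(vault))}."
restored = unmask(reply, vault)
```

Here `masked` contains only tokens, and `restored` contains the real Aadhaar number again, which mirrors the send / LLM view / receive sequence shown above.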
| | CloakPipe | Presidio | Protecto | LLMGuard |
|---|---|---|---|---|
| Language | Rust | Python | Python | Python |
| Latency | <5ms | 50–200ms | 50–200ms | 50–200ms |
| Mode | Drop-in proxy | Library | Cloud SaaS | Library |
| Reversible masking | ✅ Encrypted vault | ❌ Permanent redaction | ✅ Cloud vault | ❌ Permanent |
| India PII | ✅ Aadhaar, PAN, UPI | ❌ | Partial | ❌ |
| Self-hosted | ✅ Single binary | ✅ | Partial | ✅ |
| MCP support | ✅ (via Cloud) | ❌ | ❌ | ❌ |
| Price | Free (open source) | Free | $$$$ | Free |
| Dependencies | 0 (single binary) | Python + spaCy | Python + cloud | Python + PyTorch |
CloakPipe uses a three-layer detection system for speed and accuracy:
Input Text
│
▼
┌─────────────────────────────────────┐
│ Layer 1: Regex Pre-Filter │ <1ms
│ Aadhaar, PAN, email, phone, │
│ credit card, SSN, IP address │
│ Catches ~60% of PII instantly │
├─────────────────────────────────────┤
│ Layer 2: ONNX NER Model │ ~3ms
│ GLiNER2 transformer-based NER │
│ Context-aware: names, orgs, │
│ medical terms, addresses │
├─────────────────────────────────────┤
│ Layer 3: Fuzzy Entity Resolution │ <1ms
│ Jaro-Winkler similarity matching │
│ Links "Dr. R. Singh" and │
│ "Rajesh Singh" as same entity │
└─────────────────────────────────────┘
│
▼
Masked Output (total: <5ms)
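Layer 3 names Jaro-Winkler similarity. The sketch below is a plain textbook implementation in Python, not CloakPipe's Rust code; the `same_entity` helper and its 0.9 threshold are assumptions made for illustration. Note that linking an abbreviated form such as "Dr. R. Singh" to "Rajesh Singh" would need extra normalization (title stripping, initial handling) that this sketch leaves out.

```python
def jaro(s1, s2):
    """Jaro similarity between two strings, in the range 0.0 to 1.0."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1, s2, prefix_scale=0.1):
    """Jaro similarity boosted by a shared prefix of up to four characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


def same_entity(a, b, threshold=0.9):
    """Hypothetical helper: treat two mentions as one entity above a threshold."""
    return jaro_winkler(a.lower(), b.lower()) >= threshold
```

With this sketch, a misspelled mention such as "Rajesh Sinh" resolves to "Rajesh Singh", while unrelated strings such as "Apollo Hospital" do not.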
Tokens are deterministic within a session — the same entity always maps to the same token. This means the LLM maintains coherence across the conversation.
Tokens are non-deterministic across sessions — the same entity maps to a different token in a new session, preventing cross-session correlation.
All entity ↔ token mappings are stored in a local vault encrypted with AES-256-GCM. The vault never leaves your infrastructure. There is no cloud dependency.
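One way to get tokens that are stable within a session but unlinkable across sessions is a keyed hash with a per-session key. The sketch below uses HMAC-SHA256 for this; it is a swapped-in technique chosen for illustration, and both the `SessionTokenizer` class and its hex-suffixed token format are hypothetical (the tokens shown elsewhere in this README use numeric suffixes such as PERSON_042).

```python
import hashlib
import hmac
import secrets


class SessionTokenizer:
    """Hypothetical tokenizer: same entity -> same token, per session only."""

    def __init__(self, session_key=None):
        # A fresh random key per session unless one is supplied (e.g. in tests).
        self.session_key = session_key or secrets.token_bytes(32)

    def token_for(self, kind, value):
        message = f"{kind}:{value}".encode("utf-8")
        digest = hmac.new(self.session_key, message, hashlib.sha256).hexdigest()
        # Truncated for readability; a real system would also need to
        # handle the (rare) collisions that truncation allows.
        return f"{kind}_{digest[:8].upper()}"


session_a = SessionTokenizer(b"a" * 32)
session_b = SessionTokenizer(b"b" * 32)
first = session_a.token_for("PERSON", "Rajesh Singh")
again = session_a.token_for("PERSON", "Rajesh Singh")
other = session_b.token_for("PERSON", "Rajesh Singh")
```

`first` and `again` are identical because they share a session key, while `other` comes from a different key and so cannot be correlated with them without that key.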
| Entity | Example | Detection |
|---|---|---|
| Person Name | John Smith, Dr. Priya Sharma | NER |
| Email Address | user@example.com | Regex |
| Phone Number | +1-555-0123, +91 98765 43210 | Regex |
| Credit Card | 4532-1234-5678-9012 | Regex + Luhn |
| SSN | 123-45-6789 | Regex |
| Date of Birth | 15/03/1990, March 15, 1990 | NER |
| Address | 123 MG Road, Pune 411001 | NER |
| IP Address | 192.168.1.1, 2001:db8::1 | Regex |
| Organization | Apollo Hospital, HDFC Bank | NER |
| Medical Term | diabetes, cardiac arrest | NER |
| Bank Account | IFSC + account number | Regex |
| Passport Number | J1234567 | Regex |
| License Plate | MH 12 AB 1234 | Regex |
| URL | https://internal.company.com | Regex |
| API Key | sk-live_xxx, AKIA... | Regex |
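The "Regex + Luhn" entry for credit cards means a regex finds candidate digit runs and a Luhn checksum then filters out numbers that cannot be real card numbers. A minimal Python sketch of the checksum step follows; the function name and the 12-digit minimum length are assumptions, not taken from CloakPipe.

```python
def luhn_valid(number):
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 12:  # assumed lower bound for card-like numbers
        return False
    total = 0
    # Walk right to left, doubling every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0
```

For example, the widely used test number `4111 1111 1111 1111` passes, while changing its last digit to 2 fails, so a checksum filter of this kind cuts false positives from arbitrary 16-digit strings.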
| Entity | Format | Example |
|---|---|---|
| Aadhaar Number | 12 digits (XXXX XXXX XXXX) | 2345 6789 0123 |
| PAN Card | ABCDE1234F | BNZPM2501F |
| UPI ID | name@bank | rajesh@okicici |
| Indian Phone | +91 XXXXX XXXXX | +91 98765 43210 |
| GSTIN | 15-char alphanumeric | 27AAPFU0939F1ZV |
| Indian Passport | Letter + 7 digits | J1234567 |
These Indian PII formats are detected natively, with no custom recognizers or plugins required.
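The formats in the table above translate fairly directly into regular expressions. The patterns below are a hedged sketch written from those formats, not CloakPipe's actual rules; `find_indian_pii` is a hypothetical helper. The Aadhaar pattern only checks shape (real Aadhaar numbers also carry a Verhoeff check digit, which is omitted here), and the UPI pattern uses a lookahead to avoid matching ordinary email addresses.

```python
import re

# Illustrative patterns derived from the documented formats (assumptions).
PATTERNS = {
    # 5 letters, 4 digits, 1 letter
    "PAN": re.compile(r"\b[A-Z]{5}[0-9]{4}[A-Z]\b"),
    # 2-digit state code, embedded PAN, entity code, 'Z', check character
    "GSTIN": re.compile(r"\b[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z][1-9A-Z]Z[0-9A-Z]\b"),
    # 12 digits in groups of 4, first digit 2-9; checksum not verified
    "AADHAAR": re.compile(r"\b[2-9][0-9]{3}[ -]?[0-9]{4}[ -]?[0-9]{4}\b"),
    # name@bank, rejected when a dotted domain follows (i.e. an email)
    "UPI": re.compile(r"\b[\w.-]{2,}@[a-zA-Z]{2,}\b(?!\.\w)"),
}


def find_indian_pii(text):
    """Return {entity_type: [matches]} for every pattern that matches."""
    found = {}
    for kind, pattern in PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            found[kind] = matches
    return found


sample = (
    "PAN BNZPM2501F, GSTIN 27AAPFU0939F1ZV, "
    "UPI rajesh@okicici, Aadhaar 2345 6789 0123"
)
result = find_indian_pii(sample)
```

The word-boundary anchors matter here: the PAN embedded inside the GSTIN is not reported as a separate PAN, because it is preceded by a digit rather than a boundary.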
from openai import OpenAI
# Just change the base URL. That's it.
client = OpenAI(
    base_url="http://localhost:3100/v1",  # CloakPipe proxy
    api_key="sk-your-openai-key"  # Your real API key
)
response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content": "Analyze the account for Priya Sharma, PAN BNZPM2501F"}
    ]
)
# CloakPipe detected PAN and person name, masked them,
# sent sanitized prompt to OpenAI, and unmasked the response.
print(response.choices[0].message.content)

from langchain_openai import ChatOpenAI
llm = ChatOpenAI(
    model="gpt-4",
    openai_api_base="http://localhost:3100/v1",  # CloakPipe proxy
    openai_api_key="sk-your-key"
)
response = llm.invoke("Summarize patient records for Aadhaar 2345 6789 0123")

from anthropic import Anthropic
client = Anthropic(
    base_url="http://localhost:3100/v1/anthropic",  # CloakPipe proxy
    api_key="sk-ant-your-key"
)
message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Review the loan application for Amit Patel, PAN ABCDE1234F"}
    ]
)

# Works with any LLM API that uses the OpenAI format
curl http://localhost:3100/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-d '{
"model": "gpt-4",
"messages": [{"role": "user", "content": "Your prompt with PII here"}]
}'

import { createOpenAI } from '@ai-sdk/openai';
import { generateText } from 'ai';
// Provider-level settings such as baseURL are passed to createOpenAI, not to the model call.
const openai = createOpenAI({
  baseURL: 'http://localhost:3100/v1', // CloakPipe proxy
});
const result = await generateText({
  model: openai('gpt-4'),
  prompt: 'Analyze the customer data for Rajesh, Aadhaar 2345 6789 0123',
});

# Scan text for PII (no proxy, just detection)
cloakpipe scan "Dr. Rajesh Singh, Aadhaar 2345 6789 0123"
# Output:
# ✓ PERSON: "Dr. Rajesh Singh" (confidence: 0.97)
# ✓ AADHAAR: "2345 6789 0123" (confidence: 1.00)
# Mask text (replace PII with tokens)
cloakpipe mask "Contact Priya at priya@example.com or +91 98765 43210"
# Output: "Contact PERSON_001 at EMAIL_001 or PHONE_001"
# Start the proxy server
cloakpipe serve --port 3100
# Start with a specific policy
cloakpipe serve --port 3100 --policy policies/dpdp.yaml
# Check proxy health
cloakpipe health

# Proxy settings
CLOAKPIPE_PORT=3100 # Proxy port (default: 3100)
CLOAKPIPE_HOST=0.0.0.0 # Bind address (default: 0.0.0.0)
CLOAKPIPE_LOG_LEVEL=info # Log level: debug, info, warn, error
# LLM provider
CLOAKPIPE_UPSTREAM_URL=https://api.openai.com # Default upstream LLM API
CLOAKPIPE_TIMEOUT=30 # Request timeout in seconds
# Detection
CLOAKPIPE_POLICY=policies/dpdp.yaml # Policy file path
CLOAKPIPE_MIN_CONFIDENCE=0.8 # Minimum NER confidence threshold (0.0–1.0)
# Vault
CLOAKPIPE_VAULT_PATH=./vault.db # Encrypted vault file path
CLOAKPIPE_VAULT_KEY= # 256-bit encryption key (auto-generated if empty)
# Cloud (optional, for dashboard users)
CLOAKPIPE_CLOUD_TOKEN= # Cloud dashboard token (app.cloakpipe.co)

CloakPipe uses YAML policy files to configure detection behavior per compliance framework:
# policies/dpdp.yaml — India Digital Personal Data Protection Act
name: "DPDP Act 2023"
version: "1.0"
description: "Policy for India's Digital Personal Data Protection Act"
entities:
  # Always detect and mask these
  required:
    - aadhaar_number
    - pan_card
    - upi_id
    - person_name
    - phone_number_in
    - email_address
    - date_of_birth
    - address
    - bank_account_in
    - gstin
  # Detect but warn (don't mask by default)
  advisory:
    - organization
    - medical_term
    - ip_address
  # Skip these
  disabled:
    - ssn # US-only
    - passport_us # US-only
masking:
  strategy: "deterministic" # deterministic | random | hash
  format: "{TYPE}_{ID}" # e.g., PERSON_042
  session_scope: true # Same entity → same token within session
logging:
  log_detections: true
  log_masked_prompts: false # Never log original PII
  export_format: "json" # json | csv

Pre-built policies included: dpdp.yaml, gdpr.yaml, hipaa.yaml, pci-dss.yaml, minimal.yaml
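To make the three policy tiers concrete, here is a small Python sketch of how a detection could be routed according to a policy of this shape. The `POLICY` dict is an abbreviated stand-in for the parsed YAML, and `action_for` and `apply_policy` are hypothetical helpers; treating unlisted entity types as skipped is an assumption, not documented CloakPipe behavior.

```python
# Abbreviated stand-in for the parsed policy file (assumption).
POLICY = {
    "entities": {
        "required": ["aadhaar_number", "pan_card", "person_name"],
        "advisory": ["organization", "medical_term"],
        "disabled": ["ssn"],
    }
}


def action_for(entity_type, policy=POLICY):
    """Map an entity type to 'mask', 'warn', or 'skip' under the policy."""
    entities = policy["entities"]
    if entity_type in entities.get("required", []):
        return "mask"
    if entity_type in entities.get("advisory", []):
        return "warn"
    # Disabled or unlisted types are left untouched in this sketch.
    return "skip"


def apply_policy(detections, policy=POLICY):
    """Group (entity_type, text) detections by the action the policy assigns."""
    grouped = {"mask": [], "warn": [], "skip": []}
    for entity_type, text in detections:
        grouped[action_for(entity_type, policy)].append(text)
    return grouped


detections = [
    ("person_name", "Rajesh Singh"),
    ("organization", "Apollo Hospital Mumbai"),
    ("ssn", "123-45-6789"),
]
grouped = apply_policy(detections)
```

Under this sketch the person name is masked, the organization is only flagged, and the SSN is ignored, matching the required / advisory / disabled split in the DPDP policy.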
CloakPipe is built as a modular Rust workspace with 8 crates:
cloakpipe/
├── crates/
│ ├── cloakpipe-core # Detection, replacement, vault, rehydration
│ ├── cloakpipe-proxy # HTTP proxy server (axum, OpenAI-compatible)
│ ├── cloakpipe-tree # CloakTree: vectorless LLM-driven retrieval
│ ├── cloakpipe-vector # ADCPE distance-preserving vector encryption
│ ├── cloakpipe-local # Fully local mode (candle-rs embeddings + LanceDB)
│ ├── cloakpipe-audit # Compliance logging and audit trails
│ ├── cloakpipe-mcp # MCP server (6 tools via rmcp)
│ └── cloakpipe-cli # CLI interface (scan, mask, serve, vault, session)
├── policies/
│ ├── dpdp.yaml
│ ├── gdpr.yaml
│ ├── hipaa.yaml
│ └── pci-dss.yaml
├── Cargo.toml
├── LICENSE
└── README.md
cloakpipe-cli
├── cloakpipe-proxy
│ ├── cloakpipe-core
│ ├── cloakpipe-tree
│ ├── cloakpipe-vector
│ └── cloakpipe-audit
└── cloakpipe-mcp
└── cloakpipe-core
Each crate is independently usable. If you only need PII detection in your Rust app without the proxy, depend on cloakpipe-core directly.
Tested on standard PII datasets (English + Indian PII) with 1,000 text samples.
| Tool | Language | Avg Latency | P99 Latency | Accuracy (F1) | Reversible |
|---|---|---|---|---|---|
| CloakPipe | Rust | 3.2ms | 4.8ms | 0.91 | ✅ |
| Presidio | Python | 87ms | 142ms | 0.84 | ❌ |
| LLMGuard | Python | 112ms | 198ms | 0.82 | ❌ |
| Regex-only | Any | 0.5ms | 0.8ms | 0.61 | ❌ |
CloakPipe averages 3.2ms against Presidio's 87ms, roughly 27x faster, while also scoring higher on F1 (0.91 vs 0.84). The ONNX model runs on an optimized Rust runtime rather than Python's GIL-constrained spaCy pipeline.
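The speedup figure follows directly from the average latencies in the table. As a quick check in Python, using only the numbers reported above:

```python
# Average latencies in milliseconds, copied from the benchmark table.
latency_ms = {"CloakPipe": 3.2, "Presidio": 87.0, "LLMGuard": 112.0}

# Speedup of CloakPipe relative to each tool (1.0 for itself).
speedup = {
    tool: round(ms / latency_ms["CloakPipe"], 1)
    for tool, ms in latency_ms.items()
}
```

This gives about 27.2x over Presidio and 35.0x over LLMGuard on average latency.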
Need analytics, audit trails, or team features? CloakPipe Cloud adds a dashboard on top of the open-source proxy.
The proxy always runs on your infra. PII never leaves your network. Only anonymized telemetry (entity counts, latency metrics) goes to the dashboard.
| Feature | OSS (Free) | Cloud Pro ($99/mo) | Cloud Business ($499/mo) |
|---|---|---|---|
| Core proxy + detection | ✅ | ✅ | ✅ |
| Encrypted vault | ✅ | ✅ | ✅ |
| Policy templates | ✅ | ✅ | ✅ |
| India PII (Aadhaar, PAN, UPI) | ✅ | ✅ | ✅ |
| Dashboard + analytics | — | ✅ | ✅ |
| Audit trail export | — | ✅ | ✅ |
| Compliance reports | — | ✅ | ✅ |
| Privacy Chat UI | — | ✅ | ✅ |
| Multi-user | — | Up to 10 | Unlimited |
| RBAC + SSO | — | — | ✅ |
| Custom entity types | — | — | ✅ |
| Webhook alerts | — | — | ✅ |
| Kubernetes Helm chart | — | — | ✅ |
| MCP Server (6 tools) | — | — | ✅ |
| Support | Community | Priority |
version: '3.8'
services:
  cloakpipe:
    image: ghcr.io/cloakpipe/cloakpipe:latest
    ports:
      - "3100:3100"
    environment:
      - CLOAKPIPE_UPSTREAM_URL=https://api.openai.com
      - CLOAKPIPE_POLICY=policies/dpdp.yaml
      - CLOAKPIPE_LOG_LEVEL=info
    volumes:
      - cloakpipe-vault:/data/vault
    restart: unless-stopped
volumes:
  cloakpipe-vault:

[Unit]
Description=CloakPipe LLM Privacy Proxy
After=network.target
[Service]
Type=simple
ExecStart=/usr/local/bin/cloakpipe serve --port 3100
Restart=always
Environment=CLOAKPIPE_UPSTREAM_URL=https://api.openai.com
[Install]
WantedBy=multi-user.target

We welcome contributions. See CONTRIBUTING.md for guidelines.
Good first issues:
- Add new regex pattern for a PII type
- Improve NER accuracy on Indian names
- Add integration example (Haystack, LlamaIndex, etc.)
- Write documentation for a use case
Development setup:
git clone https://github.com/rohansx/cloakpipe.git
cd cloakpipe
cargo build
cargo test
cargo run -p cloakpipe-cli -- serve --port 3100

- Core proxy with PII detection and masking
- AES-256-GCM encrypted vault
- Regex + ONNX NER detection pipeline
- Jaro-Winkler fuzzy entity resolution
- India PII support (Aadhaar, PAN, UPI, GSTIN)
- CloakTree: vectorless LLM-driven retrieval
- ADCPE distance-preserving vector encryption
- Industry profiles (legal, healthcare, fintech)
- MCP server (6 tools)
- Session-aware pseudonymization + coreference resolution
- Anthropic API native format support
- Multi-language NER (Hindi, Marathi, Tamil)
- WebSocket proxy mode
- Custom entity type plugins (WASM)
- TEE support (AWS Nitro Enclaves)
CloakPipe is security-focused software. If you find a vulnerability, please report it responsibly:
Email: security@cloakpipe.co
Do not file a public GitHub issue for security vulnerabilities.
Apache-2.0. See LICENSE.
The CloakPipe Cloud dashboard and enterprise features are proprietary (BUSL-1.1).