Secure | Distributed | Verifiable | Private
Features | Architecture | Quick Start | Documentation | API
EnclaveML is a distributed system that enables secure ML inference across multiple heterogeneous machines while providing cryptographic guarantees that computations were performed correctly. Using zero-knowledge proofs (zkML), each worker node proves it executed the model faithfully—without revealing the underlying data or model weights.
```
┌──────────────┐      ┌──────────────────────────────────┐      ┌──────────────┐
│    Client    │ ───▶ │      Aggregator + Verifier       │ ◀─── │   Workers    │
│   Request    │      │ Splits • Coordinates • Verifies  │      │  CUDA/Metal  │
└──────────────┘      └──────────────────────────────────┘      └──────────────┘
```
## Features

| Feature | Description |
|---|---|
| Distributed Inference | Split workloads across multiple GPU/CPU nodes based on capacity |
| Zero-Knowledge Proofs | Cryptographic verification without revealing data or weights |
| Multi-Backend | CUDA (NVIDIA), Metal (Apple Silicon), and CPU support |
| Privacy First | Input data never leaves your local network |
| Fault Tolerant | Invalid proofs rejected, system continues with valid results |
| Flexible Aggregation | Multiple result combination strategies for different use cases |
## Architecture

```
                              ┌─────────────────┐
                              │     Client      │
                              │ POST /inference │
                              └────────┬────────┘
                                       │
                                       ▼
┌──────────────────────────────────────────────────────────────────────────────┐
│                               AGGREGATOR NODE                                │
│   ┌────────────────┐      ┌─────────────────┐      ┌─────────────────┐       │
│   │    REST API    │      │   Coordinator   │      │    Verifier     │       │
│   │   Port 8080    │      │  Data Splitter  │      │  Proof Checker  │       │
│   └────────────────┘      └─────────────────┘      └─────────────────┘       │
└────────────────┬─────────────┬─────────────┬─────────────┬───────────────────┘
                 │             │             │             │
        ┌────────▼───┐  ┌──────▼────┐  ┌─────▼─────┐  ┌────▼───────┐
        │  Worker 1  │  │ Worker 2  │  │ Worker 3  │  │  Worker 4  │
        │ ━━━━━━━━━━ │  │ ━━━━━━━━━ │  │ ━━━━━━━━━ │  │ ━━━━━━━━━━ │
        │ CUDA 48GB  │  │ CUDA 12GB │  │ CUDA 6GB  │  │ Metal 16GB │
        │    60%     │  │    15%    │  │    8%     │  │    17%     │
        └────────────┘  └───────────┘  └───────────┘  └────────────┘
```
### Request Flow

```
1. Client ─────────▶ POST /inference with data
                     │
2. Aggregator ─────▶ Split data by worker capacity
                     │
3. Workers ────────▶ Process chunks in parallel
                     ├─ Run inference on GPU/CPU
                     ├─ Generate ZK proof
                     └─ Return result + proof
                     │
4. Aggregator ─────▶ Verify all proofs
                     ├─ Reject invalid proofs
                     └─ Combine valid results
                     │
5. Client ◀──────── Receive verified result
```
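Step 4 is where the fault tolerance in the feature table comes from: an invalid proof costs the request one chunk, not the whole computation. Below is a minimal Rust sketch of that verify-then-combine step; `WorkerResponse` and `verify_proof` are hypothetical stand-ins, not EnclaveML's actual types.

```rust
/// Hypothetical stand-in for the aggregator's per-worker response type.
struct WorkerResponse {
    node_id: String,
    output: Vec<f32>,
    proof: Vec<u8>,
}

/// Placeholder: the real verifier checks the zkML proof against the
/// model/input/output commitments instead of returning a constant.
fn verify_proof(_proof: &[u8], _output: &[f32]) -> bool {
    true
}

/// Keep only responses whose proofs verify, then average what is left.
/// Returns None if every proof was rejected.
fn aggregate(responses: &[WorkerResponse]) -> Option<Vec<f32>> {
    let valid: Vec<&WorkerResponse> = responses
        .iter()
        .filter(|r| verify_proof(&r.proof, &r.output))
        .collect();
    if valid.is_empty() {
        return None; // nothing trustworthy to combine
    }
    let dim = valid[0].output.len();
    let mut combined = vec![0.0f32; dim];
    for r in &valid {
        for (acc, x) in combined.iter_mut().zip(&r.output) {
            *acc += x;
        }
    }
    for acc in &mut combined {
        *acc /= valid.len() as f32;
    }
    Some(combined)
}
```

Because rejected responses are dropped rather than aborting, the request still succeeds as long as at least one worker returns a valid proof.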
## How It Works

EnclaveML orchestrates ML inference across multiple machines, each contributing GPU power while maintaining data privacy and computation integrity:
```
        YOUR DATA                             VERIFIED RESULTS
            │                                         ▲
            ▼                                         │
┌──────────────────────────────────────────────────────────────┐
│                    AGGREGATOR (Mac Mini)                     │
│                                                              │
│  1. Receives your inference request                          │
│  2. Splits data proportionally by worker GPU memory          │
│  3. Dispatches chunks to workers in parallel                 │
│  4. Collects results + cryptographic proofs                  │
│  5. Verifies every proof (rejects any failures)              │
│  6. Combines valid results and returns to you                │
└─────────────────┬────────────────────────────────────────────┘
                  │
    ┌─────────────┼─────────────┬─────────────┐
    ▼             ▼             ▼             ▼
┌────────┐    ┌────────┐    ┌────────┐    ┌────────┐
│Worker 1│    │Worker 2│    │Worker 3│    │Worker 4│
│  48GB  │    │  12GB  │    │  6GB   │    │  16GB  │
│  60%   │    │  15%   │    │   8%   │    │  17%   │
└────────┘    └────────┘    └────────┘    └────────┘
    │             │             │             │
    └─────────────┴─────────────┴─────────────┘
                  │
      Each worker independently:
        • Runs ML inference on its chunk
        • Generates ZK proof of correct execution
        • Returns result + proof to aggregator
```
No worker sees another worker's data. Every result is cryptographically verified.
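The "result + proof" each worker returns is anchored to three 32-byte commitments (see the proof structure further down). As a sketch of how a worker could derive them, assuming SHA-256 via the `sha2` crate (the README does not pin down the hash function or the float encoding):

```rust
use sha2::{Digest, Sha256};

/// Hash a float slice by its little-endian byte encoding.
/// The encoding choice is an assumption made for this sketch.
fn hash_f32s(values: &[f32]) -> [u8; 32] {
    let mut hasher = Sha256::new();
    for v in values {
        hasher.update(v.to_le_bytes());
    }
    hasher.finalize().into()
}

/// Commitment the worker sends alongside its result; field names are
/// taken from the proof layout shown later in this README.
struct ProofCommitment {
    model_hash: [u8; 32],
    input_hash: [u8; 32],
    output_hash: [u8; 32],
    node_id: String,
}

fn commit(model_bytes: &[u8], input: &[f32], output: &[f32], node_id: &str) -> ProofCommitment {
    ProofCommitment {
        model_hash: Sha256::digest(model_bytes).into(),
        input_hash: hash_f32s(input),
        output_hash: hash_f32s(output),
        node_id: node_id.to_string(),
    }
}
```

The aggregator can recompute `input_hash` for the chunk it dispatched and compare, which ties each proof to the exact data that worker was given.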
## Example Cluster

| Machine | IP | Role | What It Runs | Unique Config |
|---|---|---|---|---|
| Mac Mini | 10.0.0.1 | Aggregator + Worker | REST API, Coordinator, Metal Worker | Port 8080 (API), 50053 (worker) |
| PC 1 | 10.0.0.2 | Worker | CUDA inference + proofs | node_id: worker-48gb |
| PC 2 | 10.0.0.3 | Worker | CUDA inference + proofs | node_id: worker-12gb |
| PC 3 | 10.0.0.4 | Worker | CUDA inference + proofs | node_id: worker-6gb |
Each machine has its own:
- Static IP address
- Node ID (self-assigned in config)
- Memory capacity declaration
- Device type (cuda vs metal)
All machines share:
- Same EnclaveML binary (built with appropriate features)
- Same ML model (loaded at startup)
- Same network (10.0.0.0/24)
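For illustration, here is how those per-machine settings might deserialize with `serde` and `serde_yaml`; the field names mirror the worker YAML in the Configuration section below, but the real structs live in the EnclaveML crates and may differ.

```rust
use serde::Deserialize;

/// Hypothetical mirror of the worker YAML shown under Configuration.
#[derive(Debug, Deserialize)]
struct WorkerConfig {
    node_id: String,            // unique per machine (e.g. worker-48gb)
    port: u16,                  // 50052 on the PCs, 50053 on the Mac Mini
    aggregator_address: String, // shared: everyone points at 10.0.0.1
    device: String,             // "cpu" | "cuda" | "metal"
    memory_gb: u32,             // capacity declaration used for splitting
}

#[derive(Debug, Deserialize)]
struct ConfigFile {
    worker: WorkerConfig,
}

fn load_config(path: &str) -> Result<WorkerConfig, Box<dyn std::error::Error>> {
    let text = std::fs::read_to_string(path)?;
    Ok(serde_yaml::from_str::<ConfigFile>(&text)?.worker)
}
```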
See doc/BOOK.md for complete step-by-step deployment instructions.
### Node Requirements

| Node | Role | GPU | RAM | OS | Backend |
|---|---|---|---|---|---|
| 1 | Worker | NVIDIA 48GB | 64GB+ | Ubuntu 22.04+ | CUDA |
| 2 | Worker | NVIDIA 12GB | 32GB+ | Ubuntu 22.04+ | CUDA |
| 3 | Worker | NVIDIA 6GB | 16GB+ | Ubuntu 22.04+ | CUDA |
| 4 | Aggregator + Worker | Apple M4 16GB | 16GB | macOS 14+ | Metal |
### Network Topology

```
┌─────────────────────────────────────────────────────────────────┐
│                    Local Network 10.0.0.0/24                    │
│                                                                 │
│   ┌─────────────┐   ┌─────────────┐   ┌─────────────┐           │
│   │  Worker 1   │   │  Worker 2   │   │  Worker 3   │           │
│   │  10.0.0.2   │   │  10.0.0.3   │   │  10.0.0.4   │           │
│   │   :50052    │   │   :50052    │   │   :50052    │           │
│   └──────┬──────┘   └──────┬──────┘   └──────┬──────┘           │
│          │                 │                 │                  │
│          └──────────────┬──┴──────────┬──────┘                  │
│                         │             │                         │
│                 ┌───────▼─────────────▼───────┐                 │
│                 │         Aggregator          │                 │
│                 │          10.0.0.1           │                 │
│                 │ REST :8080  │  gRPC :50051  │                 │
│                 └─────────────────────────────┘                 │
└─────────────────────────────────────────────────────────────────┘
```
### Ports

| Port | Protocol | Service |
|---|---|---|
| 8080 | HTTP | Aggregator REST API |
| 50051 | gRPC | Aggregator coordination |
| 50052 | gRPC | Worker nodes |
| 50053 | gRPC | Colocated Metal worker on the aggregator (Mac Mini) |
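Workers are registered through the `POST /workers` endpoint listed in the API section below. A hedged sketch of that call using `reqwest` (with its `blocking` and `json` features enabled); the payload shape here is a guess, as the real schema belongs to the `aggregator` crate.

```rust
use serde::Serialize;

/// Hypothetical registration payload; field names are assumptions.
#[derive(Serialize)]
struct RegisterWorker {
    node_id: String,
    address: String, // e.g. "10.0.0.2:50052"
    device: String,  // "cuda" or "metal"
    memory_gb: u32,
}

/// POST the worker's details to the aggregator's REST port (8080).
fn register(aggregator: &str, w: &RegisterWorker) -> Result<(), reqwest::Error> {
    let url = format!("http://{}/workers", aggregator);
    reqwest::blocking::Client::new()
        .post(&url)
        .json(w)
        .send()?
        .error_for_status()?; // surface 4xx/5xx as errors
    Ok(())
}
```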
## Quick Start

### Install Dependencies

All machines:

```bash
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source $HOME/.cargo/env
```

NVIDIA machines:

```bash
sudo apt update && sudo apt install nvidia-cuda-toolkit
nvcc --version   # Verify
```

Mac Mini:

```bash
xcode-select --install
```

### Build

```bash
git clone https://github.com/yourusername/EnclaveML.git
cd EnclaveML

# CUDA machines (Nodes 1, 2, 3)
cargo build --release --features cuda

# Mac Mini (Node 4)
cargo build --release --features metal

# CPU-only testing
cargo build --release
```

### Run

Start Aggregator (Mac Mini):

```bash
enclaveml start-aggregator --config examples/aggregator-config.yaml
```

Start Workers (each NVIDIA machine):

```bash
enclaveml start-node --config examples/worker-48gb-config.yaml   # Node 1
enclaveml start-node --config examples/worker-12gb-config.yaml   # Node 2
enclaveml start-node --config examples/worker-6gb-config.yaml    # Node 3
```

```bash
# Submit request
enclaveml inference --input examples/sample-input.json --output result.json

# Check status
enclaveml status --detailed
```
## Configuration

### Aggregator Config

```yaml
aggregator:
  port: 8080
  model_hash: null
  strict_mode: true
  request_timeout_secs: 30
  split_strategy: by_capacity    # equal | by_capacity | redundant
  combine_strategy: average      # average | weighted_average | majority_vote | soft_vote | median
```
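To make `by_capacity` concrete, here is an illustrative splitter that sizes chunks in proportion to each worker's declared memory; the function name and remainder policy are assumptions, not EnclaveML's actual implementation.

```rust
/// Illustrative `by_capacity` splitter: chunk sizes proportional to each
/// worker's declared memory_gb, with the flooring remainder handed to the
/// largest worker so every sample is assigned exactly once.
fn split_by_capacity(n_samples: usize, memory_gb: &[u32]) -> Vec<usize> {
    let total: usize = memory_gb.iter().map(|&m| m as usize).sum();
    assert!(total > 0, "at least one worker must declare capacity");
    let mut counts: Vec<usize> = memory_gb
        .iter()
        .map(|&m| n_samples * m as usize / total)
        .collect();
    // Give the rounding remainder to the worker with the most memory.
    let assigned: usize = counts.iter().sum();
    let biggest = memory_gb
        .iter()
        .enumerate()
        .max_by_key(|&(_, &m)| m)
        .map(|(i, _)| i)
        .unwrap();
    counts[biggest] += n_samples - assigned;
    counts
}
```

For 100 samples on the 48/12/6/16 GB cluster above, this yields chunks of 60, 14, 7, and 19 samples (the flooring remainder goes to the 48 GB worker).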
### Worker Config

```yaml
worker:
  node_id: worker-1
  port: 50052
  aggregator_address: "10.0.0.1:8080"
  device: cuda        # cpu | cuda | metal
  memory_gb: 48
```
## API

| Method | Endpoint | Description |
|---|---|---|
| `GET` | `/health` | Health check |
| `GET` | `/status` | Cluster status |
| `GET` | `/workers` | List workers |
| `POST` | `/workers` | Register worker |
| `DELETE` | `/workers/:id` | Remove worker |
| `POST` | `/inference` | Run inference |
| `GET` | `/model/hash` | Get model hash |
### Example Request

```bash
curl -X POST http://10.0.0.1:8080/inference \
  -H "Content-Type: application/json" \
  -d '{
    "data": [0.1, 0.2, 0.3, ...],
    "sample_size": 64,
    "combine_strategy": "average"
  }'
```

Response:

```json
{
  "result": [0.1, 0.2, 0.7],
  "proofs_verified": 3,
  "total_nodes": 3,
  "inference_time_ms": 150,
  "verified": true
}
```

## Zero-Knowledge Proofs

EnclaveML uses zkML to verify computation integrity without exposing:
- Model weights
- Input data (beyond verification needs)
- Intermediate values
### Proof Structure

```
┌─────────────────────────────────────────┐
│             InferenceProof              │
├─────────────────────────────────────────┤
│  ProofCommitment                        │
│  ├─ model_hash    [u8; 32]              │
│  ├─ input_hash    [u8; 32]              │
│  ├─ output_hash   [u8; 32]              │
│  └─ node_id       String                │
├─────────────────────────────────────────┤
│  output: Vec<f32>                       │
├─────────────────────────────────────────┤
│  proof_bytes                            │
│  ├─ Magic: "ZKML"                       │
│  ├─ Version: 0x01                       │
│  ├─ Commitment proof                    │
│  ├─ Computation proof                   │
│  └─ Consistency proof                   │
└─────────────────────────────────────────┘
```
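Only the `ZKML` magic and the version byte of `proof_bytes` are pinned down by this layout; everything after them is opaque to callers. Here is a sketch of the outer envelope check a verifier might run before touching the three inner proofs (the five-byte header comes from the diagram, the rest is assumption):

```rust
/// Minimal envelope check for the proof layout sketched above.
/// How the three inner proofs are delimited past the header is an
/// assumption of this sketch, not something this README specifies.
fn check_envelope(proof_bytes: &[u8]) -> Result<&[u8], &'static str> {
    if proof_bytes.len() < 5 {
        return Err("proof too short");
    }
    if &proof_bytes[0..4] != b"ZKML" {
        return Err("bad magic");
    }
    if proof_bytes[4] != 0x01 {
        return Err("unsupported proof version");
    }
    // Remainder carries the commitment, computation, and consistency proofs.
    Ok(&proof_bytes[5..])
}
```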
## Aggregation Strategies

| Strategy | Best For | Description |
|---|---|---|
| `average` | Regression | Mean of all outputs |
| `weighted_average` | Mixed capacity | Weighted by worker memory |
| `majority_vote` | Classification | Hard voting on predictions |
| `soft_vote` | Ensemble | Averaged probabilities |
| `median` | Outlier resistance | Median per dimension |
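To show how two of these differ, here are minimal `weighted_average` and `median` combiners over per-worker outputs. This is illustrative code, not the `aggregator` crate's implementation; it assumes every worker returns a vector of the same dimension.

```rust
/// `weighted_average`: weight each worker's output by its memory share.
fn weighted_average(outputs: &[Vec<f32>], memory_gb: &[u32]) -> Vec<f32> {
    let total: f32 = memory_gb.iter().map(|&m| m as f32).sum();
    let mut out = vec![0.0f32; outputs[0].len()];
    for (o, &m) in outputs.iter().zip(memory_gb) {
        let w = m as f32 / total;
        for (acc, x) in out.iter_mut().zip(o) {
            *acc += w * x;
        }
    }
    out
}

/// `median`: per-dimension median, robust to a single outlier worker.
/// Panics on NaN outputs; a production combiner would reject those first.
fn median(outputs: &[Vec<f32>]) -> Vec<f32> {
    (0..outputs[0].len())
        .map(|d| {
            let mut col: Vec<f32> = outputs.iter().map(|o| o[d]).collect();
            col.sort_by(|a, b| a.partial_cmp(b).unwrap());
            let mid = col.len() / 2;
            if col.len() % 2 == 1 {
                col[mid]
            } else {
                (col[mid - 1] + col[mid]) / 2.0
            }
        })
        .collect()
}
```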
## Project Structure

```
EnclaveML/
├── core/         # Core types: DataChunk, Device, NodeInfo, Proof
├── model/        # ML model with Burn framework
├── zkml/         # Zero-knowledge proof generation & verification
├── node/         # Worker node: gRPC server, device detection
├── aggregator/   # Coordinator, verifier, combiner, REST API
├── cli/          # Command-line interface
└── examples/     # Configuration templates
```
## Development

```bash
# Run all tests
cargo test --workspace

# Run specific crate tests
cargo test -p enclaveml-core

# Build documentation
cargo doc --workspace --open
```

## Troubleshooting

### Worker not connecting

```bash
ping 10.0.0.1                     # Check network
sudo ufw status                   # Check firewall
curl http://10.0.0.1:8080/health  # Check aggregator
```

### CUDA errors

```bash
nvidia-smi       # Check driver
nvcc --version   # Check CUDA
nvidia-smi --query-gpu=memory.free --format=csv  # Check memory
```

### Metal errors (macOS)

```bash
system_profiler SPDisplaysDataType | grep Metal
```

## Performance

| GPU Memory | Recommended Batch Size |
|---|---|
| 48GB | 256-512 |
| 12GB | 64-128 |
| 6GB | 32-64 |
| 10GB (M4) | 64-96 |
## Documentation

| Document | Description |
|---|---|
| ARCHITECTURE.md | System design & protocols |
| DEPLOYMENT.md | Step-by-step cluster setup |