A production-grade, asynchronous Load Balancer built in Rust using Tokio and Hyper 1.0. Designed to demonstrate advanced SRE concepts including Consistent Hashing, Graceful Shutdown, Health-Based Failover, and Real-time Observability.
- ⚡ Hyper 1.0 & Tokio: Built on Rust’s modern async ecosystem (`hyper`, `hyper-util`, Tokio) for non-blocking I/O and high concurrency.
- 🔄 Consistent Hashing: Uses a virtual-node ring algorithm to minimize cache-miss impact during scaling events (unlike simple Round Robin).
- 🛡️ Token Bucket Rate Limiter: Protects backends from DDoS attacks and "noisy neighbor" traffic by enforcing strict per-client-IP RPS limits (see the sketch after this list).
- 💓 Active Health Monitoring: Background task actively probes backend availability and automatically ejects unhealthy nodes in <3 seconds.
- 🛑 Graceful Shutdown: Implements "Zero Downtime" deployments. Catches SIGINT, stops accepting new connections, and waits for active requests to drain before exiting.
- 📊 Observability Stack: Native integration with Prometheus for metrics (requests, latency, errors) and Grafana for visualization.
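Of these features, the rate limiter is the most self-contained to sketch. Below is a minimal per-client-IP token bucket; the `TokenBucket`/`RateLimiter` names and the burst/refill numbers are illustrative assumptions, not this repository's exact types:

```rust
use std::collections::HashMap;
use std::net::IpAddr;
use std::time::Instant;

/// Per-IP token bucket: `capacity` caps bursts, `refill_rate`
/// (tokens per second) enforces the steady-state RPS limit.
struct TokenBucket {
    tokens: f64,
    capacity: f64,
    refill_rate: f64,
    last_refill: Instant,
}

impl TokenBucket {
    fn new(capacity: f64, refill_rate: f64) -> Self {
        Self { tokens: capacity, capacity, refill_rate, last_refill: Instant::now() }
    }

    /// Refill proportionally to elapsed time, then try to spend one token.
    fn try_acquire(&mut self) -> bool {
        let now = Instant::now();
        let elapsed = now.duration_since(self.last_refill).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_rate).min(self.capacity);
        self.last_refill = now;
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}

/// One bucket per client IP; a request that finds its bucket empty is dropped.
struct RateLimiter {
    buckets: HashMap<IpAddr, TokenBucket>,
}

impl RateLimiter {
    fn new() -> Self {
        Self { buckets: HashMap::new() }
    }

    fn allow(&mut self, ip: IpAddr) -> bool {
        self.buckets
            .entry(ip)
            .or_insert_with(|| TokenBucket::new(10.0, 5.0)) // burst of 10, 5 RPS sustained
            .try_acquire()
    }
}
```

A production limiter would also evict idle buckets periodically so the map cannot grow without bound.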
```mermaid
graph LR
Client[Client Traffic] -->|Port 3000| LB[RustyLB Load Balancer]
subgraph "RustyLB Internals"
LB --> RateLimit[Token Bucket Limiter]
RateLimit --> HashRing[Consistent Hash Ring]
HashRing -->|Select Node| Forward[Hyper 1.0 Proxy]
end
subgraph "Backend Services"
Forward -->|HTTP| S1[Service 8081]
Forward -->|HTTP| S2[Service 8082]
Forward -->|HTTP| S3[Service 8083]
end
subgraph "Observability"
Prometheus -->|Scrape /metrics| LB
Grafana -->|Query| Prometheus
end
Health[Health Monitor] -.->|Probe| S1
Health -.->|Probe| S2
Health -.->|Probe| S3
```
Diagram renders natively on GitHub.
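The Health Monitor in the diagram runs as a background Tokio task. Here is a minimal sketch of one way to write that probe loop; the `Backend` type, the 1-second interval, and the 500 ms connect timeout are illustrative assumptions, not the repository's exact implementation:

```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicBool, Ordering};
use std::time::Duration;
use tokio::net::TcpStream;
use tokio::time::{interval, timeout};

/// Illustrative backend handle: the proxy only routes to nodes whose
/// `healthy` flag is set, so flipping the flag ejects a node from rotation.
struct Backend {
    addr: String, // e.g. "127.0.0.1:8081"
    healthy: AtomicBool,
}

/// Probe every backend once per second with a 500 ms connect timeout.
/// A dead node fails its next probe and is ejected well inside a
/// few-second window.
async fn health_monitor(backends: Arc<Vec<Backend>>) {
    let mut ticker = interval(Duration::from_secs(1));
    loop {
        ticker.tick().await;
        for backend in backends.iter() {
            let alive = timeout(
                Duration::from_millis(500),
                TcpStream::connect(backend.addr.as_str()),
            )
            .await
            .map(|res| res.is_ok())
            .unwrap_or(false);
            backend.healthy.store(alive, Ordering::Relaxed);
        }
    }
}
```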
- Consistent Hashing vs. Round Robin: I chose Consistent Hashing to ensure cache locality. In a distributed system, if one node dies, Round Robin would reshuffle all keys; Consistent Hashing only reshuffles ~1/N of the keys, preventing cache stampedes (see the ring sketch after this list).
- Passive Circuit Breaking: Instead of a complex state machine (Open/Half-Open), I implemented a passive system where the Health Monitor acts as the source of truth for the Hash Ring. This simplifies the logic while maintaining resilience.
- Labeled Metrics: Instead of structured logging for every request (which is expensive), I used labeled Prometheus metrics to track latency and error rates per backend in real time.
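To make the ~1/N claim concrete, here is a minimal virtual-node ring over a sorted `BTreeMap`. The virtual-node count and the use of `std`'s `DefaultHasher` are illustrative assumptions (a production ring would pin a hash that is stable across builds):

```rust
use std::collections::BTreeMap;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Consistent hash ring: the BTreeMap keeps virtual-node hash points
/// sorted, so a lookup is a range scan that wraps around the ring.
struct HashRing {
    ring: BTreeMap<u64, String>, // hash point -> backend address
    vnodes: usize,
}

fn hash_of<T: Hash>(value: &T) -> u64 {
    // DefaultHasher is fine for a sketch but not stable across Rust
    // releases; pin a concrete hash in real code.
    let mut h = DefaultHasher::new();
    value.hash(&mut h);
    h.finish()
}

impl HashRing {
    fn new(vnodes: usize) -> Self {
        Self { ring: BTreeMap::new(), vnodes }
    }

    /// Each backend claims `vnodes` points, smoothing key distribution.
    fn add(&mut self, backend: &str) {
        for i in 0..self.vnodes {
            self.ring.insert(hash_of(&format!("{backend}#{i}")), backend.to_string());
        }
    }

    /// Removing a backend frees only its own points: every other node's
    /// keys keep mapping exactly where they did before, hence ~1/N churn.
    fn remove(&mut self, backend: &str) {
        self.ring.retain(|_, b| b != backend);
    }

    /// Walk clockwise from the key's hash to the first virtual node,
    /// wrapping to the start of the ring if necessary.
    fn node_for(&self, key: &str) -> Option<&String> {
        let h = hash_of(&key);
        self.ring
            .range(h..)
            .next()
            .or_else(|| self.ring.iter().next())
            .map(|(_, b)| b)
    }
}
```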
- Rust (Cargo)
- Docker (for Grafana/Prometheus)
Start Prometheus and Grafana automatically:
```bash
docker-compose up -d
```
- Grafana: http://localhost:3001 (User: admin / Pass: admin)
- Prometheus: http://localhost:9091
```bash
# Run with info logs enabled
RUST_LOG=info cargo run --bin lb
```
You can use `curl` to send traffic. The LB listens on port 3000.

```bash
curl http://127.0.0.1:3000
```
To prove the Graceful Shutdown capability (ensuring no users are disconnected during a deployment):
Note: The slow backend used in this test is an external test harness and is intentionally not part of this repository.
(e.g., any HTTP server that sleeps for several seconds before responding)
```bash
curl http://127.0.0.1:3000
```

While the request is held by the slow backend, immediately hit Ctrl+C in the Rust terminal.
The load balancer will log:
```text
🛑 Graceful shutdown...
```
It will wait for the in-flight request to finish, and the client will successfully receive:
```text
I survived the shutdown! 🎉
```
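Under the hood, the pattern is: catch SIGINT, stop the accept loop, then wait for in-flight work to drain. A minimal sketch of that drain logic using `tokio::select!` and `tokio_util::task::TaskTracker` (the tracker is an assumption for illustration; the repository may track connections differently):

```rust
use tokio::net::TcpListener;
use tokio::signal;
use tokio_util::task::TaskTracker;

#[tokio::main]
async fn main() -> std::io::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:3000").await?;
    let tracker = TaskTracker::new();

    loop {
        tokio::select! {
            // New connection: serve it on a tracked task.
            Ok((stream, _peer)) = listener.accept() => {
                tracker.spawn(async move {
                    // ... placeholder for the actual proxy logic ...
                    let _ = stream;
                });
            }
            // SIGINT (Ctrl+C): break out so no new connections are accepted.
            _ = signal::ctrl_c() => {
                println!("🛑 Graceful shutdown...");
                break;
            }
        }
    }

    // Drop the listener, then wait for every in-flight task to finish
    // before the process exits: active clients are never cut off.
    drop(listener);
    tracker.close();
    tracker.wait().await;
    Ok(())
}
```

For hyper connections specifically, `hyper-util` also ships a graceful-shutdown helper that serves the same draining purpose.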
The Load Balancer exposes standard SRE metrics at `/metrics`:

| Metric Name | Type | Description |
|---|---|---|
| `requests_total` | Counter | Total requests routed, per backend |
| `requests_dropped_total` | Counter | Requests blocked by the Rate Limiter |
| `active_connections` | Gauge | Current in-flight requests |
| `request_duration_seconds` | Histogram | Request latency distribution (P50–P99 via PromQL) |
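As a sketch of how metrics with these names are typically registered using the `prometheus` crate (the repository's actual registration code may differ):

```rust
use prometheus::{
    register_histogram_vec, register_int_counter, register_int_counter_vec,
    register_int_gauge, HistogramVec, IntCounter, IntCounterVec, IntGauge,
};

/// Labeled metrics: the `backend` label yields one time series per node,
/// so per-backend latency and error rates fall out of PromQL without
/// logging every request.
fn build_metrics() -> (IntCounterVec, IntCounter, IntGauge, HistogramVec) {
    let requests = register_int_counter_vec!(
        "requests_total",
        "Total requests routed per backend",
        &["backend"]
    )
    .unwrap();
    let dropped = register_int_counter!(
        "requests_dropped_total",
        "Requests blocked by the rate limiter"
    )
    .unwrap();
    let active = register_int_gauge!(
        "active_connections",
        "Current in-flight requests"
    )
    .unwrap();
    let latency = register_histogram_vec!(
        "request_duration_seconds",
        "Request latency distribution",
        &["backend"]
    )
    .unwrap();
    (requests, dropped, active, latency)
}
```

With the histogram in place, P99 latency is a single PromQL query, e.g. `histogram_quantile(0.99, rate(request_duration_seconds_bucket[5m]))`.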
This project focuses on infrastructure correctness and observability, not feature breadth.
Intentionally out of scope:
- TLS termination
- HTTP/2 / gRPC
- Dynamic config reload
- Full circuit breaker state machines
These tradeoffs keep the codebase small, auditable, and focused on SRE fundamentals.