This document provides example charts and visualizations from the ThemisDB Performance Dashboard along with interpretation guidelines.
What it shows: Operations per second for Create, Read, Update, and Delete (CRUD) workloads over time, across different branches.
Key Metrics:
- Write Throughput: Target > 40,000 ops/sec
- Read Throughput: Target > 100,000 ops/sec
- CRUD Mixed: Target > 45,000 ops/sec
Example Interpretation:
Main Branch (blue line): 45,234 ops/sec ✅ Above target
Develop Branch (green line): 43,112 ops/sec ⚠️ Slightly below
PR #1234 (orange line): 38,901 ops/sec ❌ Regression detected
What to look for:
- ✅ Stable or upward trends
- ⚠️ Sudden drops (investigate immediately)
- ℹ️ Gradual decline over multiple days (technical debt accumulation)
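The targets above translate directly into a check script. A minimal sketch (the function name and the 5% warning band are ours; the sample numbers echo the hypothetical branch examples above):

```python
# Throughput targets from this guide (ops/sec).
TARGETS = {"write": 40_000, "read": 100_000, "crud_mixed": 45_000}

def check_throughput(workload: str, ops_per_sec: float) -> str:
    """Classify a measured throughput against its target."""
    target = TARGETS[workload]
    if ops_per_sec >= target:
        return "pass"
    # Within 5% of target: warn rather than fail outright (our convention).
    if ops_per_sec >= target * 0.95:
        return "warn"
    return "fail"

print(check_throughput("crud_mixed", 45_234))  # main branch example -> pass
print(check_throughput("crud_mixed", 43_112))  # develop branch example -> warn
print(check_throughput("crud_mixed", 38_901))  # PR example -> fail
```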
What it shows: Response time distribution for queries and operations.
Thresholds:
- P50 (Median): < 2ms ideal
- P95: < 10ms target
- P99: < 50ms critical threshold
Example Data:
| Benchmark | P50 | P95 | P99 | Status |
|---|---|---|---|---|
| Point Read | 0.8ms | 2.1ms | 4.5ms | ✅ Excellent |
| Point Write | 1.2ms | 3.8ms | 8.2ms | ✅ Good |
| Range Query | 5.4ms | 18.2ms | 45.1ms | ✅ Acceptable |
| Complex Join | 12.3ms | 89.4ms | 156.7ms | ❌ Exceeds threshold |
What to look for:
- P99/P50 ratio (should be < 10x)
- Sudden spikes in P99 (outliers or systemic issues)
- Gradual increase over releases (performance degradation)
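The P99/P50 ratio check can be computed straight from raw latency samples. A sketch using a nearest-rank percentile (the sample distribution is made up to mirror the Point Read row above):

```python
def percentile(samples, p):
    """Nearest-rank percentile of a list of latency samples (ms)."""
    ordered = sorted(samples)
    rank = max(0, min(len(ordered) - 1, round(p / 100 * len(ordered)) - 1))
    return ordered[rank]

# 100 hypothetical point-read samples: mostly fast, with a small tail.
latencies = [0.8] * 90 + [2.1] * 8 + [4.5] * 2
p50, p99 = percentile(latencies, 50), percentile(latencies, 99)
print(p50, p99, p99 / p50 < 10)  # ratio under 10x = healthy tail
```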
What it shows: Percentage of failed operations over time.
Targets:
- Normal: < 0.1%
- Warning: 0.1% - 1%
- Critical: > 1%
Example Visualization:
Error Rate Over Time (7 days)
1.2% | * 🔴 Critical spike
1.0% | * *
0.8% | * *
0.5% | * * 🟡 Elevated
0.2% | * *
0.1% |* ***** 🟢 Normal
0.0% |________________*********___________
Mon Tue Wed Thu Fri Sat Sun
What to look for:
- Spikes correlating with deployments
- Patterns (time-of-day, day-of-week)
- Sustained elevated rates
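The three bands above map to a simple classifier. A sketch (function name and sample counts are illustrative):

```python
def classify_error_rate(errors: int, total: int) -> str:
    """Map an error percentage to the traffic-light bands above."""
    pct = 100 * errors / total
    if pct < 0.1:
        return "normal"
    if pct <= 1.0:
        return "warning"
    return "critical"

print(classify_error_rate(5, 10_000))    # 0.05% -> normal
print(classify_error_rate(50, 10_000))   # 0.5%  -> warning
print(classify_error_rate(120, 10_000))  # 1.2%  -> critical
```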
What it shows: Side-by-side throughput comparison between branches.
Example Data:
Throughput Comparison (ops/sec)
Main: ████████████████████████ 45,234
Develop: ██████████████████████ 42,112 (-6.9%)
Feature: ████████████████ 35,890 (-20.7%) ⚠️
Decision Matrix:
| Difference | Action |
|---|---|
| < 5% | ✅ Acceptable, merge |
| 5-10% | ⚠️ Review, justify before merge |
| 10-20% | ❌ Block, investigate |
| > 20% | ❌ Block, critical regression |
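The decision matrix can be applied mechanically to two throughput measurements. A sketch (function name is ours; the sample numbers are the branch comparison above):

```python
def merge_decision(baseline: float, candidate: float) -> str:
    """Apply the decision matrix to two throughput figures (ops/sec)."""
    diff_pct = 100 * (baseline - candidate) / baseline  # positive = slower
    if diff_pct < 5:
        return "merge"
    if diff_pct < 10:
        return "review"
    if diff_pct < 20:
        return "block: investigate"
    return "block: critical regression"

print(merge_decision(45_234, 42_112))  # develop, -6.9% -> review
print(merge_decision(45_234, 35_890))  # feature, -20.7% -> block: critical regression
```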
What it shows: Performance evolution across releases.
Example Chart:
Release Performance Trend
50K | v1.4.1
| v1.4.0 *
45K | v1.3.2 * *
| v1.3.1 * *
40K | v1.3.0 * *
| * *
35K |*
|_________________________________
Jan Feb Mar Apr May
Legend: * = Release
What to look for:
- Consistent improvements (good)
- Flat trends (stagnation)
- Regressions between releases (immediate attention)
What it shows: Performance across different hardware configurations.
Example Data:
| Hardware | CRUD ops/sec | Latency P99 | Cost/hr | Efficiency |
|---|---|---|---|---|
| GitHub Actions (2 cores) | 25,000 | 15ms | $0.008 | 3.1M ops/$ |
| AWS c5.2xlarge (8 cores) | 65,000 | 8ms | $0.34 | 191K ops/$ |
| Bare Metal (32 cores) | 180,000 | 3ms | $2.50 | 72K ops/$ |
What to look for:
- Linear scaling with cores
- Cost efficiency for your use case
- Consistent P99 across hardware
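The Efficiency column appears to be throughput divided by hourly cost (ops/sec per $/hr); a quick sanity check against the table (the helper is ours, not part of any ThemisDB tooling):

```python
def efficiency(ops_per_sec: float, cost_per_hr: float) -> float:
    """Throughput per dollar-hour, as shown in the hardware table."""
    return ops_per_sec / cost_per_hr

print(round(efficiency(25_000, 0.008)))  # ~3.1M ops/$ (GitHub Actions)
print(round(efficiency(65_000, 0.34)))   # ~191K ops/$ (AWS c5.2xlarge)
print(round(efficiency(180_000, 2.50)))  # 72K ops/$   (bare metal)
```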
What it shows: Tokens per second for LLM inference workloads.
Targets:
- 7B models: > 50 tokens/sec
- 13B models: > 25 tokens/sec
- 70B models: > 5 tokens/sec
Example Chart:
Token Generation Rate (tokens/sec)
60 | Mistral-7B *****
50 | ** **
40 | ** **
30 | Llama-13B * ***
20 | ** **
10 | Llama-70B** ***
0 |_________________________________
0 5 10 15 20 25 30 min
What to look for:
- Consistent generation rate (no throttling)
- First token latency < 200ms
- No memory leaks (rate stays stable)
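Checking a run against the size-class targets is a one-liner once you have a token count and elapsed time. A sketch (names and sample figures are illustrative):

```python
# Token-rate targets per model size class, from this guide (tokens/sec).
TOKEN_TARGETS = {"7B": 50, "13B": 25, "70B": 5}

def meets_target(size: str, tokens: int, elapsed_sec: float) -> bool:
    """True if the observed generation rate exceeds the class target."""
    return tokens / elapsed_sec > TOKEN_TARGETS[size]

print(meets_target("7B", 3_300, 60))  # 55 tok/s vs. 50 target -> True
print(meets_target("70B", 240, 60))   # 4 tok/s vs. 5 target -> False
```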
What it shows: Performance of HNSW and other vector index types.
Benchmarks:
- HNSW: Approximate nearest neighbor search
- Flat: Brute-force exact search
- IVF: Inverted file index
Example Data:
| Index Type | Build Time | Search QPS | Recall@10 | Memory |
|---|---|---|---|---|
| HNSW (M=16, ef=200) | 45s | 5,400 | 0.97 | 2.1GB |
| Flat | 0s | 180 | 1.00 | 1.5GB |
| IVF (nlist=100) | 12s | 2,100 | 0.93 | 1.8GB |
What to look for:
- Recall vs. speed trade-off
- Memory usage under limits
- Search performance > 1000 QPS for production
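Recall@10 is the fraction of the true top-10 neighbors that the approximate index actually returns. A sketch with hypothetical result IDs (the function is ours, not a ThemisDB API):

```python
def recall_at_k(retrieved: list, ground_truth: list, k: int = 10) -> float:
    """Fraction of the true top-k neighbors found in the top-k results."""
    return len(set(retrieved[:k]) & set(ground_truth[:k])) / k

# Hypothetical IDs: the ANN index misses one true neighbor out of ten.
truth = list(range(10))
approx = [0, 1, 2, 3, 4, 5, 6, 7, 8, 42]
print(recall_at_k(approx, truth))  # 0.9
```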
What it shows: Number of detected regressions categorized by severity.
Example Dashboard Panel:
┌─────────────────────────────────┐
│ ❌ Critical: 0 │
│ ⚠️ Major: 2 │
│ ℹ️ Minor: 5 │
│ ✅ Improvements: 8 │
└─────────────────────────────────┘
Alert Rules:
- Critical (>20%): Page on-call
- Major (10-20%): Create ticket
- Minor (5-10%): Review in weekly meeting
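The alert rules above can be encoded as a severity router. A minimal sketch (function name and return strings are ours):

```python
def route_regression(change_pct: float) -> str:
    """Map a regression's magnitude to the alert rules above."""
    drop = abs(change_pct)
    if drop > 20:
        return "page on-call"       # critical
    if drop >= 10:
        return "create ticket"      # major
    if drop >= 5:
        return "weekly review"      # minor
    return "no action"

print(route_regression(-15.8))  # major regression -> create ticket
print(route_regression(-3.0))   # within noise -> no action
```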
What it shows: Benchmarks with largest performance changes in last hour/day.
Example Table:
| Benchmark | Previous | Current | Change | Trend |
|---|---|---|---|---|
| vector_search_100k | 4,200 qps | 5,100 qps | +21.4% | ✅ Improvement |
| write_batch_1000 | 38,000 | 32,000 | -15.8% | ❌ Regression |
| read_point_query | 125,000 | 128,000 | +2.4% | ✅ Stable |
| complex_join | 890 qps | 865 qps | -2.8% | ✅ Stable |
What to look for:
- Correlate with recent code changes
- Verify intentional vs. unintentional changes
- Document justified regressions
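The Change column is the relative difference between the two measurements. Reproducing the table values (the helper is ours):

```python
def change_pct(previous: float, current: float) -> float:
    """Relative change between two measurements, as a percentage."""
    return 100 * (current - previous) / previous

print(round(change_pct(4_200, 5_100), 1))      # +21.4 (vector_search_100k)
print(round(change_pct(38_000, 32_000), 1))    # -15.8 (write_batch_1000)
print(round(change_pct(125_000, 128_000), 1))  # +2.4  (read_point_query)
```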
- Critical metrics at a glance
- Regression counts
- Overall health status
- Time range selector
- Main performance metrics
- Branch comparisons
- Trend lines
- Percentile distributions
- SLO compliance
- Outlier detection
- Error rates
- Hardware comparisons
- Specialized benchmarks (LLM, Vector)
- Top changes
- Detailed tables
- Investigation tools
- Look at regression panel (top-left)
  - 0 critical = ✅
  - Any critical = 🚨 investigate immediately
- Check throughput trends (center-left)
  - Lines going up or flat = ✅
  - Lines trending down = ⚠️
- Verify P99 latency (center-right)
  - Under threshold = ✅
  - Spikes = ⚠️ investigate
- Error rate (bottom-left)
  - Near 0% = ✅
  - > 0.1% = ⚠️
- Identify the regression
  - Which benchmark?
  - When did it start?
  - How severe?
- Find the cause
  - Correlate with commits (use annotations)
  - Check branch/PR
  - Review code changes
- Assess impact
  - Production vs. benchmark environment
  - User-facing operations?
  - Critical path?
- Plan remediation
  - Revert?
  - Optimize?
  - Accept trade-off?
# Show rate of change
deriv(themisdb_benchmark_throughput_ops{benchmark="crud"}[1h])

# Percentage difference vs. main (ignore the branch label when matching)
(themisdb_benchmark_throughput_ops{branch="pr-1234"}
  / ignoring(branch)
 themisdb_benchmark_throughput_ops{branch="main"} - 1) * 100

# More than two standard deviations from the 24h mean
abs(themisdb_benchmark_latency_ms
  - avg_over_time(themisdb_benchmark_latency_ms[24h]))
  > 2 * stddev_over_time(themisdb_benchmark_latency_ms[24h])

# Hourly average (PromQL has no group-by-hour; average over a 1h window)
avg_over_time(themisdb_benchmark_throughput_ops[1h])
- Click "Add Panel" in Grafana
- Select visualization type:
  - Time series: Trends over time
  - Stat: Single value KPIs
  - Table: Detailed comparisons
  - Bar chart: Categorical comparisons
- Write PromQL query:
  # Example: Average throughput by branch
  avg(themisdb_benchmark_throughput_ops) by (branch)
- Configure thresholds:
  - Green: Above target
  - Yellow: Warning range
  - Red: Critical
- Add to dashboard
# Screenshot
# Use Grafana's built-in share → snapshot feature
# Or use the rendering API:
curl -H "Authorization: Bearer YOUR_API_KEY" \
  "http://localhost:3000/render/d-solo/themisdb-performance-dashboard/...?width=800&height=400" \
  > chart.png

# PDF Export (requires Grafana Enterprise, or use Puppeteer)
# Alternative: screenshot all panels and assemble them in a document