Functional goal: Verify a person’s identity remotely via:
- Live photo (liveness + anti-spoof)
- Identity document (detection & OCR for documents worldwide)
- Face matching (live photo ↔ document photo)
- Structural validation of document fields (MRZ, country codes, dates, formats, checksums…)
- Overall validity score + structured data extraction
-
Technical scope: Secure REST API, multi-tenant, multi-region, with observability, usage-based billing, quotas & rate limits, webhooks, audit logs, and GDPR-like compliance.
-
Usage modes:
- Standalone API calls (each module can run independently)
- Full workflow (configurable processing pipeline)
- Super-admin (platform operator): manages clients, plans, pricing, SLAs, API keys, quotas, logs.
- Client (tenant): holds API keys, configures webhooks, views dashboards, downloads logs.
- System: asynchronous services (Celery) for heavy jobs & webhook dispatching.
-
Liveness & Anti-spoof
- Detects real-time capture (blink/pose/gaze) & spoof attempts (screen, print, mask, rephoto).
- Output: is_live (bool), spoof_type (enum), confidence (0–1), hints.
-
Document Detection & OCR
- Auto-detects type (passport, ID card, license, residence permit) + country.
- Extracts fields (NAME, GIVEN NAMES, DOB, document number, sex, nationality, dates, MRZ, etc.).
- Normalizes (ISO date, ISO-3166 country, doc type).
-
Document Validation
- Formal checks (MRZ checksum, field lengths, date consistency, coherence).
- Detects missing areas, alterations, bad cropping.
- Output: document_valid (bool), checks[], confidence.
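The MRZ checksum named among the formal checks can be sketched as follows. This is a minimal illustration assuming the standard ICAO Doc 9303 check-digit rule (repeating weights 7, 3, 1; letters A–Z valued 10–35; the filler `<` valued 0). The function names and the `"fail"` status value are hypothetical; the specification only shows `"pass"`.

```python
DIGITS = "0123456789"


def mrz_char_value(ch: str) -> int:
    """Map one MRZ character to its numeric value (ICAO Doc 9303 rule)."""
    if ch in DIGITS:
        return int(ch)
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10
    if ch == "<":
        return 0
    raise ValueError(f"invalid MRZ character: {ch!r}")


def mrz_check_digit(field: str) -> int:
    """Weighted sum with repeating weights 7, 3, 1, taken modulo 10."""
    weights = (7, 3, 1)
    total = sum(mrz_char_value(c) * weights[i % 3] for i, c in enumerate(field))
    return total % 10


def mrz_checksum_check(field: str, check_char: str) -> dict:
    """Return one entry shaped like the items of `checks[]` (status values assumed)."""
    is_digit = len(check_char) == 1 and check_char in DIGITS
    ok = is_digit and mrz_check_digit(field) == int(check_char)
    return {"name": "mrz_checksum", "status": "pass" if ok else "fail"}


# Document number from the ICAO specimen passport: "L898902C3" has check digit 6.
print(mrz_checksum_check("L898902C3", "6"))  # {'name': 'mrz_checksum', 'status': 'pass'}
```

A real validator would run this per MRZ field (document number, date of birth, expiry date, composite) and append each result to `checks[]`.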
-
Face Match
- Compares live photo ↔ document face (or stored user photo).
- Output: match (bool), similarity (0–1); thresholds are customizable.
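To illustrate how a similarity in the 0–1 range and a customizable threshold could produce the `match` flag, here is a minimal sketch. It assumes faces are already encoded as embedding vectors and that similarity is cosine similarity clamped to [0, 1]. Both are assumptions, since the specification does not name the metric; `face_match` and its parameters are hypothetical names.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embeddings must be non-zero")
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def face_match(live_emb: Sequence[float], doc_emb: Sequence[float],
               threshold: float = 0.75) -> dict:
    """Build a response shaped like the Face Match output."""
    # Clamp to [0, 1]: negative cosine values are treated as "no similarity".
    similarity = max(0.0, min(1.0, cosine_similarity(live_emb, doc_emb)))
    return {
        "match": similarity >= threshold,
        "similarity": round(similarity, 4),
        "threshold": threshold,
    }
```

The 0.75 default mirrors the threshold quoted in the explanation example; a tenant-specific value would be passed in instead.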
-
Full e-KYC Workflow
- Orchestration: liveness → ocr → validate → face_match → scoring.
- Customizable: order, thresholds, mandatory fields, rules.
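The configurable orchestration described above might look like the sketch below. It assumes each step is a callable that receives the request payload and the results so far and returns a dict with an `ok` flag. That contract, `run_workflow`, and the stop-on-mandatory-failure rule are all illustrative assumptions, not the platform's actual design.

```python
from typing import Callable, Dict, Iterable, List

StepFn = Callable[[dict, dict], dict]

DEFAULT_ORDER: List[str] = ["liveness", "ocr", "validate", "face_match", "scoring"]


def run_workflow(payload: dict, steps: Dict[str, StepFn],
                 order: Iterable[str] = DEFAULT_ORDER,
                 mandatory: Iterable[str] = ()) -> dict:
    """Run steps in the configured order; stop early if a mandatory step fails."""
    mandatory_set = set(mandatory)
    results: Dict[str, dict] = {}
    for name in order:
        results[name] = steps[name](payload, results)
        if name in mandatory_set and not results[name].get("ok", False):
            return {"completed": False, "stopped_at": name, "results": results}
    return {"completed": True, "stopped_at": None, "results": results}


# Stub steps standing in for the real modules.
stub_steps: Dict[str, StepFn] = {
    "liveness": lambda payload, results: {"ok": True, "is_live": True},
    "ocr": lambda payload, results: {"ok": True, "fields": {}},
    "validate": lambda payload, results: {"ok": False, "document_valid": False},
    "face_match": lambda payload, results: {"ok": True, "match": True},
    "scoring": lambda payload, results: {"ok": True},
}

outcome = run_workflow({}, stub_steps, mandatory=["liveness", "validate"])
print(outcome["completed"], outcome["stopped_at"])  # False validate
```

Passing a different `order` or `mandatory` list per tenant is one way to express the "customizable" requirement without changing the step implementations.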
-
KYA (Basic Profile) – Optional
- Lightweight checks (email MX, E.164 phone format, country/IP, name coherence).
- No AML in this version (can be added as paid module).
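As an example of one of these lightweight checks, the E.164 phone format can be tested with a regular expression: a leading `+`, a first digit of 1–9, and at most 15 digits in total. The helper name `is_e164` is hypothetical, and the sketch checks format only, not whether the number is assigned or reachable. The email MX check needs a DNS lookup and is not shown.

```python
import re

# "+" followed by 2 to 15 digits, the first of which is not 0.
E164_RE = re.compile(r"\+[1-9][0-9]{1,14}")


def is_e164(phone: str) -> bool:
    """Return True if the string is syntactically a valid E.164 number."""
    return E164_RE.fullmatch(phone) is not None


print(is_e164("+2250701020304"))  # True
print(is_e164("0701020304"))      # False (no "+" and country code)
```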
-
Overall Score (0–100), weighted as:
- Liveness (30) + Face-match (40) + Doc validation (25) + KYA (5)
-
Default thresholds (tenant-customizable):
- Accepted ≥ 80
- Manual review 60–79
- Rejected < 60
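The weighting and thresholds above can be worked through in a short sketch. It assumes each factor is first reduced to a sub-score between 0 and 1 and then multiplied by its weight, and that a missing factor (for example, KYA when disabled) contributes 0. The specification does not state either rule, and the function names are hypothetical.

```python
WEIGHTS = {"liveness": 30, "face_match": 40, "doc_validation": 25, "kya": 5}


def overall_score(factors: dict) -> float:
    """Weighted sum of 0..1 factor sub-scores, giving a score in 0..100."""
    for name, value in factors.items():
        if name not in WEIGHTS:
            raise KeyError(f"unknown factor: {name}")
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be within 0..1")
    # Assumption: a factor that is absent contributes nothing.
    return round(sum(w * factors.get(name, 0.0) for name, w in WEIGHTS.items()), 2)


def decide(score: float, accept_at: float = 80, review_at: float = 60) -> str:
    """Map a score to a decision using tenant-customizable thresholds."""
    if score >= accept_at:
        return "accepted"
    if score >= review_at:
        return "review"
    return "rejected"


# 30*0.93 + 40*0.84 + 25*0.90 + 5*1.0 = 27.9 + 33.6 + 22.5 + 5.0 = 89.0
score = overall_score({"liveness": 0.93, "face_match": 0.84,
                       "doc_validation": 0.90, "kya": 1.0})
print(score, decide(score))  # 89.0 accepted
```

With fractional scores, anything from 60 up to (but not including) 80 falls into manual review under this reading of the bands.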
-
Explanations: Detailed reasons per factor (e.g., “MRZ checksum fail”, “similarity 0.62 < 0.75 threshold”).
- Backend: Django 4.x, DRF 3.x, Python 3.11+
- Workers: Celery + Redis/RabbitMQ
- Storage: PostgreSQL 14+ (logical multi-tenancy), S3-compatible media
- AI / Vision integrations: provider-agnostic connectors (OpenCV, InsightFace, TensorRT, Tesseract, Cloud OCR…)
- Observability: Prometheus metrics, OpenTelemetry tracing, ELK logs
- Security: API-Key + HMAC signature (timestamped, anti-replay), TLS 1.2+, encryption at rest
- PII encrypted at rest (AES-256) and in transit (TLS).
- Tenant-level data isolation via scoped API keys.
- Media retention configurable (e.g., 30 days) then secure deletion.
- Right to erasure, privacy by design, immutable audit log (WORM).
- Rate limits global and per endpoint (e.g., 60 req/min/key).
- Secret rotation and API key rollover supported.
- Upload policy: MIME-type + size checks, antivirus (ClamAV), EXIF stripping.
- Tenant: name, country, support email, plan, monthly quotas.
- API Keys: api_key_id (public) + api_key_secret (never re-shown) + HMAC.
- Quota: per module (e.g., 5k liveness / month).
- Billing: usage-based (metered events table).
- Plans: Free (sandbox), Pro, Enterprise (SLA, support, dedicated region).
- Soft limit handling: HTTP 429 + webhook for thresholds (80%, 100%).
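One way to read the soft-limit rule is sketched below: each billed event increments a per-module counter, a webhook fires the first time usage crosses 80% and 100% of the quota, and requests get HTTP 429 once the quota is reached. The function names and the "fire once per crossing" behavior are assumptions made for illustration.

```python
from typing import List, Sequence


def crossed_thresholds(used_before: int, used_after: int, quota: int,
                       percents: Sequence[int] = (80, 100)) -> List[int]:
    """Return the percentage thresholds crossed by this usage increment.

    Integer arithmetic avoids float rounding at the exact boundary.
    """
    return [p for p in percents
            if used_before * 100 < p * quota <= used_after * 100]


def status_for_next_request(used: int, quota: int) -> int:
    """HTTP status for the next call: 429 once the monthly quota is used up."""
    return 429 if used >= quota else 200


# With the example quota of 5k liveness calls per month:
print(crossed_thresholds(3999, 4000, 5000))  # [80]
print(crossed_thresholds(4999, 5000, 5000))  # [100]
print(status_for_next_request(5000, 5000))   # 429
```

Comparing usage before and after the increment means each threshold webhook is emitted once rather than on every request above the line.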
Base URL (prod): https://api.example.com/v1/
Auth headers: X-API-KEY, X-API-TIMESTAMP, X-API-SIGN (HMAC-SHA256 over the timestamp, method, path, and body hash)
POST /liveness/analyze
{
  "image_live_base64": "<...>",
  "hints": {"pose": true, "blink": true}
}
200 Response
{
  "is_live": true,
  "spoof_type": "none|screen|paper|mask|unknown",
  "confidence": 0.93,
  "processing_ms": 428,
  "usage": {"module": "liveness", "billed": true}
}

POST /document/ocr
{
  "image_front_base64": "<...>",
  "image_back_base64": "<optional>",
  "document_hint": "auto|passport|id_card|driver_license",
  "country_hint": "auto|CIV|FRA|USA"
}
200 Response
{
  "detected": {"type": "passport", "country": "CIV", "confidence": 0.88},
  "fields": {...},
  "images": {"face_crop_base64": "<...>"},
  "quality": {"sharpness": 0.81, "glare": 0.07}
}

POST /document/validate
200 Response
{
  "document_valid": true,
  "checks": [{"name": "mrz_checksum", "status": "pass"}],
  "confidence": 0.9
}

POST /face/match
200 Response
{"match": true, "similarity": 0.84, "threshold": 0.75}

POST /kyc/verify
Async execution with webhook callback.
Returns job ID and queued status.
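The asynchronous behavior of /kyc/verify can be pictured with the in-memory sketch below. A plain list stands in for the Celery queue and a callback stands in for the webhook sender. The field names `job_id` and `status`, and every status other than `queued`, are assumptions, as the specification only says that a job ID and a queued status are returned.

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Job:
    tenant_id: str
    status: str = "queued"
    job_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    result: Optional[dict] = None


def submit(tenant_id: str, queue: List[Job]) -> dict:
    """What the endpoint would return immediately: job ID and queued status."""
    job = Job(tenant_id=tenant_id)
    queue.append(job)
    return {"job_id": job.job_id, "status": job.status}


def work_one(queue: List[Job], run_steps: Callable[[Job], dict],
             send_webhook: Callable[[dict], None]) -> Job:
    """What a worker would do: run the steps, store the result, notify."""
    job = queue.pop(0)
    job.status = "running"
    try:
        job.result = run_steps(job)
        job.status = "done"
    except Exception as exc:  # a failed job still triggers the callback
        job.result = {"error": str(exc)}
        job.status = "failed"
    send_webhook({"job_id": job.job_id, "status": job.status, "result": job.result})
    return job


queue: List[Job] = []
sent: List[dict] = []
ack = submit("tenant-a", queue)
finished = work_one(queue, lambda job: {"decision": "accepted"}, sent.append)
print(ack["status"], finished.status)  # queued done
```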
- Tenant, ApiKey, Job, UsageEvent, RateLimit, MediaStore, AuditLog
Images stored in S3; database holds only metadata & encrypted URLs.
- Client → /kyc/verify → creates Job (queued)
- Celery executes steps → stores result → sends webhook.
- SLA: 99.5% (Pro), 99.9% (Enterprise)
- Target latencies (p95): liveness <1200ms, OCR <1800ms, pipeline <3500ms
- Max upload: 10MB / image (JPG/PNG/WebP)
- Rate-limit: 60 rpm / 50k per day (default)
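The default per-minute limit could be enforced with a fixed-window counter, sketched below. This is only one possible algorithm (a sliding window or token bucket would also fit), the class name is hypothetical, and the in-process dictionary stands in for a shared store such as Redis. The 50k daily cap would be a second limiter with a 86,400-second window.

```python
import time
from collections import defaultdict
from typing import Callable, DefaultDict, Tuple


class FixedWindowLimiter:
    """Allow at most `limit` requests per API key in each `window_s` window."""

    def __init__(self, limit: int = 60, window_s: int = 60,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window_s = window_s
        self.clock = clock
        self._counts: DefaultDict[Tuple[str, int], int] = defaultdict(int)

    def allow(self, api_key: str) -> bool:
        """Return True and count the request, or False if the caller should get 429."""
        window = int(self.clock() // self.window_s)
        bucket = (api_key, window)
        if self._counts[bucket] >= self.limit:
            return False
        self._counts[bucket] += 1
        return True


# A controllable clock makes the behavior easy to demonstrate.
now = [0.0]
limiter = FixedWindowLimiter(limit=60, window_s=60, clock=lambda: now[0])
allowed = [limiter.allow("key-1") for _ in range(61)]
print(allowed.count(True), allowed[-1])  # 60 False
```

Old buckets are never evicted in this sketch; a production version would expire them.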
Request logs, app logs, audit logs, metrics (latency, error rates, quotas), tracing (OpenTelemetry).
Environments: dev, sandbox, prod
OpenAPI 3.1 (Swagger), synthetic test sets, load & security tests (OWASP ASVS).
Consistent JSON error format with error.code, message, details, trace_id.
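A possible shape for this error format is sketched below. It assumes all four fields are nested under a top-level `error` object, although the original wording could also mean `message`, `details`, and `trace_id` sit beside it. The helper name and the example error code are hypothetical.

```python
import json
import uuid
from typing import Optional


def error_body(code: str, message: str, details: Optional[dict] = None,
               trace_id: Optional[str] = None) -> dict:
    """Build an error payload; a trace_id is generated when none is supplied."""
    return {
        "error": {
            "code": code,
            "message": message,
            "details": details if details is not None else {},
            "trace_id": trace_id if trace_id is not None else uuid.uuid4().hex,
        }
    }


body = error_body(
    "quota_exceeded",  # hypothetical code value
    "Monthly liveness quota reached",
    details={"module": "liveness"},
    trace_id="4bf92f3577b34da6a3ce929d0e0e4736",
)
print(json.dumps(body, indent=2))
```

Reusing the OpenTelemetry trace ID as `trace_id` would let support staff jump from an error response to the matching trace.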
- spoof_type: none|screen|paper|mask|deepfake|unknown
- document_type: passport|id_card|driver_license|residence_permit|other
- decision: accepted|review|rejected
- module: liveness|ocr|validate|face_match|workflow
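These enumerations map naturally onto Python `Enum` classes, which reject unknown values when parsing. The class and member names below are illustrative; only the string values come from the list above.

```python
from enum import Enum


class SpoofType(str, Enum):
    NONE = "none"
    SCREEN = "screen"
    PAPER = "paper"
    MASK = "mask"
    DEEPFAKE = "deepfake"
    UNKNOWN = "unknown"


class DocumentType(str, Enum):
    PASSPORT = "passport"
    ID_CARD = "id_card"
    DRIVER_LICENSE = "driver_license"
    RESIDENCE_PERMIT = "residence_permit"
    OTHER = "other"


class Decision(str, Enum):
    ACCEPTED = "accepted"
    REVIEW = "review"
    REJECTED = "rejected"


class Module(str, Enum):
    LIVENESS = "liveness"
    OCR = "ocr"
    VALIDATE = "validate"
    FACE_MATCH = "face_match"
    WORKFLOW = "workflow"


# Looking up by value raises ValueError for anything outside the enumeration.
print(Decision("review").name)  # REVIEW
```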
| Module | Price (USD) |
|---|---|
| Liveness | 0.01–0.03 |
| OCR | 0.05–0.12 |
| Validation | 0.01 |
| Face-match | 0.02–0.05 |
| Workflow | sum of modules (discount tiers) |
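As a worked example of the "sum of modules" rule, the sketch below adds the per-module price ranges for a workflow that runs all four modules. It uses `Decimal` to avoid float rounding on currency, and it stops before any discount because the discount tiers are not specified. The dictionary keys and function name are hypothetical.

```python
from decimal import Decimal
from typing import Dict, Iterable, Tuple

# (low, high) price per call in USD, copied from the table above.
PRICES: Dict[str, Tuple[Decimal, Decimal]] = {
    "liveness": (Decimal("0.01"), Decimal("0.03")),
    "ocr": (Decimal("0.05"), Decimal("0.12")),
    "validation": (Decimal("0.01"), Decimal("0.01")),
    "face_match": (Decimal("0.02"), Decimal("0.05")),
}


def workflow_price_range(modules: Iterable[str]) -> Tuple[Decimal, Decimal]:
    """Sum the low and high ends of each module's price, before any discount."""
    selected = [PRICES[m] for m in modules]
    low = sum((p[0] for p in selected), Decimal("0"))
    high = sum((p[1] for p in selected), Decimal("0"))
    return low, high


low, high = workflow_price_range(["liveness", "ocr", "validation", "face_match"])
print(low, high)  # 0.09 0.21
```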
Headers:
- X-API-KEY
- X-API-TIMESTAMP
- X-API-SIGN = hex(hmac_sha256(secret, timestamp + '\n' + method + '\n' + path + '\n' + body_sha256))
Clock skew tolerance ±5 min, replay protection enabled.
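The signature formula and skew tolerance can be sketched with Python's standard `hmac` and `hashlib` modules. Several details are assumptions because the specification leaves them open: the timestamp is taken to be Unix seconds as a decimal string, `body_sha256` is the lowercase hex digest of the raw body, and the method is upper-case. Replay protection (for example, remembering recently seen signatures) is not shown.

```python
import hashlib
import hmac
import time
from typing import Optional


def sign(secret: bytes, timestamp: str, method: str, path: str, body: bytes) -> str:
    """hex(hmac_sha256(secret, timestamp \\n method \\n path \\n body_sha256))."""
    body_sha256 = hashlib.sha256(body).hexdigest()
    message = "\n".join([timestamp, method, path, body_sha256]).encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(secret: bytes, timestamp: str, method: str, path: str, body: bytes,
           signature: str, now: Optional[float] = None,
           max_skew_s: int = 300) -> bool:
    """Check the clock skew (±5 min by default), then compare in constant time."""
    current = time.time() if now is None else now
    try:
        ts = int(timestamp)
    except ValueError:
        return False
    if abs(current - ts) > max_skew_s:
        return False
    expected = sign(secret, timestamp, method, path, body)
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))


# Client side: compute the value for the X-API-SIGN header.
secret = b"example-secret"  # placeholder, not a real key
ts = "1700000000"
body = b'{"image_live_base64": "<...>"}'
x_api_sign = sign(secret, ts, "POST", "/v1/liveness/analyze", body)
```

Hashing the body separately keeps the signed string short and lets the server reject a tampered payload without re-serializing JSON.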
Manage tenants, keys, plans, quotas, monitoring, logs, and billing.
Kubernetes (autoscaling), S3 (WORM), PostgreSQL HA, CI/CD (blue-green deploy), encrypted backups.
AML, biometric enrollment, advanced fraud (deepfake), SDKs (mobile), no-code rule builder.
Availability ≥ SLA; scalable horizontally (GPU nodes optional); end-to-end traceability (audit, telemetry); OpenAPI documentation + Postman examples.
- OpenAPI spec + Postman collection (HMAC-ready)
- Integration guide (webhooks, idempotency, best practices)
- Test datasets (images, expected outputs)
- Dashboards (usage, errors, invoices)
- Back-office tools (tenants, quotas, logs)
- SLA & support contract.
Fully operational workflow; independent module endpoints; customizable scoring; accurate billing & quotas; verified HMAC security; operational observability; data retention compliance.
