Skip to content

report: Combined verification + architecture analysis (#5441, #5442, #5443)#5578

Open
beastoin wants to merge 1 commit intomainfrom
report/verification-architecture-5441-5442-5443-v2
Open

report: Combined verification + architecture analysis (#5441, #5442, #5443)#5578
beastoin wants to merge 1 commit intomainfrom
report/verification-architecture-5441-5442-5443-v2

Conversation

@beastoin
Copy link
Collaborator

Summary

Combined verification report for PRs #5441, #5442, #5443 with architecture analysis of recent shifts in the Omi backend. All 3 PRs are now merged. This PR is a report-only artifact — no code changes.

  • 40/40 tests pass (combined suite)
  • Codex audit: 6 gaps identified, 5 resolved, 1 accepted risk
  • E2E: 30s + 5min core flow on Pixel 7a with Omi BLE device — PASS

Section 1: Verification Report

PR #5441 — People/Conversations 500s Fix (yuki)

Unit Tests: 14/14 pass

Live API Verification (dev Firestore, real data):

  • GET /v1/users/people → 200 (fresh user + existing data)
  • GET /v1/conversations?limit=50 → 200, 141KB response (previously 57MB with embedded photos)
  • GET /v1/conversations/{id} → 200 (individual conversation detail)
  • 5 synthetic legacy Firestore docs (missing created_at/updated_at) → all return 200

Key Changes Verified:

  • get_conversations_without_photos() — new function that skips photo subcollection loads for list endpoints
  • get_people() — injects doc ID via data.setdefault('id', person.id) for legacy doc compatibility
  • get_people_by_ids() — uses db.get_all(doc_refs) batch fetch instead of where("id","in",...)

Codex Audit: 6 gaps identified, 5 resolved with evidence, 1 accepted risk (cross-pod 30s TTL for non-critical paths)


PR #5442 — Multipart 401 Retry (kenji)

Unit Tests: 7/7 pass

Emulator Verification:

  • APK built with JDK 21 (Gradle toolchain requirement)
  • Installed and launched on Android emulator
  • App launched without crash

Code Review — 3 retry handlers verified in app/lib/backend/http/shared.dart:

  1. makeApiCall() — standard HTTP 401 retry
  2. makeMultipartApiCall() — multipart with full request rebuild (streams can't be reused)
  3. makeMultipartApiCallStreaming() — streaming multipart with identical rebuild pattern

Key Decision: Force sign-out after second consecutive 401 failure — prevents infinite retry loops


PR #5443 — Firestore Read Ops Cache (hiro)

Unit Tests: 19/19 pass

Latency Improvements (measured with local backend):

Endpoint Before After Reduction
Mentor notification 855ms 0ms (cache hit) 100%
Apps list 3,561ms 301ms 91.5%
Mentor API 183ms 1.8ms 99%

Cache Architecture Verified:

  • Write-through invalidation confirmed (PATCH → GET cycle returns fresh data)
  • Cross-pod invalidation via Redis pub/sub — tested with real Redis integration tests
  • Memory cache: 30s TTL, LRU eviction, singleflight deduplication
  • Redis cache: 10-30min TTL depending on endpoint

Key Decision: 30s memory TTL chosen to balance freshness vs Firestore cost (mentor notification fires every 1s per active stream)


Combined Results

PR Tests Live API E2E Verdict
#5441 14/14 PASS 200 OK PASS PASS
#5442 7/7 PASS N/A (client) PASS PASS
#5443 19/19 PASS Cache verified PASS PASS
Combined 40/40 PASS All 200 PASS PASS
  • Clean merge, no conflicts between the 3 PRs
  • Cross-PR interference: none detected
  • Auth bypass check: CLEAN (no LOCAL_DEVELOPMENT flags active)

Section 2: Architecture Insights

1. New: Two-tier caching layer (PR #5443, PR #5378)

  • Memory cache (30s TTL, LRU, singleflight) sits in front of Redis (10-30min) and Firestore
  • New files: database/cache.py, database/cache_manager.py
  • Cross-pod invalidation via Redis pub/sub — when one pod writes, all pods evict stale cache
  • Key decision: 30s TTL chosen to balance freshness vs Firestore cost. Mentor notification fires every 1s per active stream — without caching, each active user generates ~60 Firestore reads/min just for notification checks

2. New: Photo-less list endpoints (PR #5441)

  • get_conversations_without_photos() skips photo subcollection loads for list endpoints
  • 400x payload reduction (57MB → 141KB for 50 conversations)
  • Key decision: Separate function rather than conditional flag — cleaner API, no risk of breaking existing callers that depend on photo data

3. New: Multipart 401 resilience (PR #5442)

  • Full HTTP request rebuild on token refresh (multipart streams are single-use — can't be rewound)
  • Consistent retry pattern across all 3 HTTP call types in the Flutter client
  • Key decision: Force sign-out after second consecutive 401 failure — prevents infinite retry storm on truly expired/revoked tokens

4. Shifted: Firestore cost model (PR #5378, deployed + verified)

  • Hot-path reads reduced 18-29% sustained (measured over 24h post-deploy)
  • LOOKUP reads down ~21% at T+20h
  • Key decision: Targeted field projections + caching for high-frequency reads. The mentor notification endpoint (every 1s per stream) was the single largest Firestore cost driver

5. Shifted: Database module scope

  • database/ now includes caching infrastructure (cache.py, cache_manager.py), not just Firestore/Redis connections
  • Module hierarchy still holds: database/utils/routers/main.py
  • Cache manager is initialized at module level in database/cache_manager.py and imported by utils/routers

6. Stable: Service map unchanged

  • backend → pusher → diarizer → deepgram → vad pipeline untouched
  • agent-proxy unchanged
  • All changes are within backend internals (data access layer + Flutter client HTTP)
  • No new services, no new inter-service communication paths

E2E Evidence

  • 30s core flow (Pixel 7a + Omi BLE): Recording → Deepgram STT → live transcript in app — PASS
  • 5min sustained recording: Continuous audio → memory creation → summary generation — PASS
  • Screenshots uploaded to GCS: gs://omi-pr-assets/pr-5445/core-flow-e2e/

Overall Verdict: PASS

All 3 PRs verified independently and combined. Now merged to main.

🤖 Generated with Claude Code

…5443)

Verification results and architecture insights for the combined
testing of PRs #5441 (people/conversations 500s), #5442 (multipart 401 retry),
and #5443 (Firestore read ops cache). All PRs now merged.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant