Performance by christopherkarani · Pull Request #9 · christopherkarani/Zoni

christopherkarani · 2026-01-08T00:16:51Z

No description provided.

## New Features ### ParentChildChunker - Hierarchical chunking strategy for 4K context windows - Creates large parent chunks (2000 chars) with small children (400 chars) - Maintains bidirectional parent-child relationships in metadata - Configurable overlap, separators, and output options ### ParentChildRetriever - Searches child embeddings but returns parent chunks for context - Score aggregation strategies: max, average, sum - Includes ParentLookup protocol and VectorStoreParentLookup actor - LRU caching for efficient parent lookups ### HuggingFaceEmbedding - HuggingFace Inference API integration - Supports MiniLM (384d), BGE (768d/1024d), E5 (1024d), Jina (768d) - Configurable wait_for_model for cold starts - Rate limit handling with retry-after support ### MistralEmbedding - Mistral AI embedding API (mistral-embed, 1024d) - Index-sorted response handling for order consistency - Rate limiter factory method added ## Tests - Comprehensive ParentChildChunkerTests covering: - Basic chunking, relationship integrity, overlap validation - Metadata completeness, edge cases, configuration 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

P1 Features (Document Preprocessing): - VisionOCRProcessor: Actor-based OCR using Apple Vision framework - extractText(from: URL/CGImage) for images - extractTextFromPDF(at:) for multi-page PDFs - Configurable languages and recognition levels - TableExtractor: Extracts tables from document images - ExtractedTable struct with markdown and CSV output - Uses VNDetectRectanglesRequest + VNRecognizeTextRequest - Groups text by Y-coordinate into rows P2 Features (Advanced Retrieval & Evaluation): - ChunkGraph: Graph data structure for chunk relationships - EdgeType: sequential, semantic, reference - Builds bidirectional edges for adjacent chunks - Creates semantic edges for high-similarity chunks (threshold: 0.8) - GraphRetriever: BFS-based graph traversal retrieval - Combines vector search with graph expansion - Configurable hops (default: 2) and edge threshold (0.7) - Score propagation with decay factor - RAGEvaluator: Evaluation framework for RAG pipelines - RetrievalMetrics: precision, recall, F1, MRR, NDCG - GenerationMetrics: faithfulness, relevance, coherence - Optional LLM judge for faithfulness scoring All implementations use Swift 6.0 actor-based concurrency with Sendable conformance. Build verified with zero errors. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Code Review Fixes (6 warnings resolved): 1. Task Cancellation in ParentChildRetriever - Added Task.checkCancellation() before embedding, after embedding, and during parent fetching loop - Prevents wasted resources when caller cancels mid-operation 2. O(n²) Complexity Limit in ChunkGraph - Added 100,000 comparison limit to buildSemanticEdges() - Documented O(n²) complexity in method documentation - Debug logging when limit is hit, suggesting k-NN alternatives 3. BFS Cancellation in GraphRetriever - Added Task.checkCancellation() at start of each BFS hop - Prevents long graph traversals from blocking cancellation 4. Error Logging in ParentChildRetriever - Added DEBUG-only logging when parent fetch fails - Improves production debugging without performance impact 5. Overflow Protection in ParentChildChunker - Used addingReportingOverflow() for offset calculation - Prevents integer overflow with very large (>2GB) documents 6. Documentation for nonisolated(unsafe) in TableExtractor - Added SAFETY comment explaining why VNRectangleObservation is safe to transfer despite missing Sendable conformance All fixes maintain Swift 6.0 strict concurrency compliance. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Code Review Fixes (3 Critical + 4 Warnings): **Critical Issues:** 1. TableExtractor Vision callback - Replaced `nonisolated(unsafe)` with Sendable `DetectedRegion` struct to safely cross actor boundaries 2. SQLiteVectorStore schema init - Added NSLock protection to prevent potential race condition in `createSchemaIfNeeded()` 3. ChunkGraph O(n²) truncation - Added `truncationWarning` property with detailed progress tracking (% processed, chunks affected) **Warnings:** 1. OllamaEmbedding batch - Added Task.checkCancellation() between iterations 2. ParentChildChunker overflow - Now throws ZoniError.chunkingFailed instead of using Int.max fallback 3. RAGEvaluator regex - Added proper error handling and DEBUG logging for regex failures 4. SQLiteVectorStore dimensions - Fixed validation to use actualDimensions when available 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The truncationWarning property was private(set), preventing callers from checking if semantic edge building was truncated. Changed to public private(set) with usage example in documentation. This ensures callers can detect and respond to partial graph construction. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Implements Phase 1 of GPU optimization plan with 10-432x speedups on large datasets (>10k vectors). Adds adaptive CPU/GPU backend selection, 6 Metal compute kernels (cosine similarity, pairwise matrix, MMR scoring), comprehensive test coverage, and automatic fallback for filtered searches. Fixes lambda validation and candidateMultiplier bounds in MMR retriever for consistency. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

christopherkarani · 2026-01-08T00:17:07Z

@claude do a comprehensive code review

claude · 2026-01-08T00:17:21Z

- Added GPUVectorStore for Metal-accelerated retrieval in ZoniApple - Refactored SemanticChunker to use injectable SimilarityCalculator - Implemented MetalSimilarityCalculator for 7x chunking speedup - Added SemanticChunkerBenchmarks - Finalized cleanup of ZoniServer integrations (removed legacy ZoniVapor/ZoniHummingbird folders)

Both server framework integrations were behind #if compilation flags that were never activated — no SPM dependencies declared, no tests running, and no feature parity between them. Removes 20 integration files (~2.8k LOC), 6 stale audit docs, 25 auto-generated CLAUDE.md files, and an empty JobProtocol.swift duplicate. Updates README and doc comments to reflect the removal. Adds ZoniConduit integration layer and various test/source improvements from the performance branch. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

claude · 2026-02-19T19:20:01Z

PR Review: Performance (#9)

This is a substantial PR adding significant new features to the Zoni RAG framework. The overall code quality is high — excellent Swift 6 actor usage, comprehensive documentation, and solid architectural patterns. Below is a detailed review grouped by category.

Summary

Category	Assessment
Swift 6 Concurrency	Excellent — actors, `Sendable`, proper isolation throughout
Documentation	Outstanding — comprehensive DocC comments on all public APIs
Architecture	Well-designed protocols with clean separation of concerns
Security	Strong authentication; one critical release-build gap
Performance	A few significant issues to address
Test Coverage	Good breadth, but notable gaps in integration and edge cases

Bugs / Correctness Issues

Critical: `precondition` Disabled in Release Builds (SQL Injection Risk)

Sources/ZoniServer/VectorStore/Stores/PgVectorStore.swift uses precondition for table name validation. precondition is compiled out in release builds (-O). This means invalid table names can reach SQL construction in production. Replace with explicit runtime validation that throws:

guard !tableName.isEmpty,
      tableName.first?.isLetter == true,
      tableName.allSatisfy({ $0.isLetter || $0.isNumber || $0 == "_" }) else {
    throw ZoniError.configurationError("Invalid table name: \(tableName)")
}

Critical: PineconeStore Silent Metadata Loss

Sources/Zoni/VectorStore/Stores/PineconeStore.swift documents that nested dicts in custom metadata are converted to null — but there is no warning or error when this happens. A user storing nested metadata will silently lose data. At minimum, log a warning or throw an error when nested dicts are detected before conversion.

MMRRetriever: Mutable Actor Properties

Sources/Zoni/Retrieval/Retrievers/MMRRetriever.swift exposes public mutable var properties directly on the actor (lambda, candidateMultiplier). These should be configurable at init time or exposed through explicit setter methods. Making them freely mutable var on an actor makes the concurrency contract confusing for callers.

RAGEvaluator: Silent Production Failures

Score parsing failures in Sources/Zoni/Evaluation/RAGEvaluator.swift silently produce 0.0 default values in production (the log is debug-only, ~line 556). This should log a warning in all build configurations so silent mis-evaluation is surfaced.

VectorStoreParentLookup: O(n) LRU Cache Operations

Sources/Zoni/Retrieval/VectorStoreParentLookup.swift maintains access order with an array, resulting in O(n) firstIndex(of:) on every cache hit. For a lookup called frequently during retrieval, this accumulates. Consider OrderedDictionary from swift-collections or a dictionary + linked list.

SQLiteVectorStore: O(n·d) Search Unscalable

Sources/Zoni/VectorStore/Stores/SQLiteVectorStore.swift loads all rows into memory and computes cosine similarity serially. At 100K vectors × 1536 dims this is ~600MB heap + heavy CPU. The implementation is fine for development, but the public API should prominently document this scaling limit so users don't use SQLite in production with large corpora.

PineconeStore: Retry Logic Inconsistently Applied

Sources/Zoni/VectorStore/Stores/PineconeStore.swift wraps search in executeWithRetry but add(), delete(), and count() call the API directly without retry. Transient network failures during bulk ingest surface as hard errors.

SemanticChunker: Fragile Offset Calculation

Sources/Zoni/Chunking/Strategies/SemanticChunker.swift reconstructs character offsets by splitting on "." and doing prefix-based string matching (~lines 358-365). This breaks with abbreviations, decimal numbers, or ellipses. Track offsets directly during the chunking pass instead.

GraphRetriever: Hardcoded Decay Factor

Sources/Zoni/Retrieval/Retrievers/GraphRetriever.swift uses 1.0 / (hop + 2) as the score decay formula with no rationale. Either document why this formula was chosen or expose it as a configurable parameter (decayFunction: (Int) -> Float).

Performance Issues

GPUVectorStore: Full Buffer Rebuild on Every Mutation

Sources/ZoniApple/VectorStore/GPUVectorStore.swift calls rebuildFlatStorage() after every add/delete, iterating the entire store each time. For bulk ingest this is O(n²). A dirty flag with lazy rebuild (already partially set up) or incremental buffer appends for add-only operations would avoid this.

OCRProcessor: Sequential PDF Page Processing

Sources/ZoniApple/Preprocessing/OCRProcessor.swift processes pages in a serial for loop (~line 256). Vision requests are independent and can be parallelized with TaskGroup for 5-10x improvement on multi-page documents.

InMemoryVectorStore+Metal: Full Sort Instead of Partial Selection

Sources/ZoniApple/VectorStore/InMemoryVectorStore+Metal.swift (~line 332) acknowledges that a heap would be more efficient for top-K selection, but still does a full sort. For small K relative to N, use partialSort or a heap — O(n log k) vs O(n log n).

Security

Authentication: TenantManager is excellent — PBKDF2-SHA256 with 120K iterations, random salt, pepper support, and constant-time comparison. Well done.
Parameterized queries: PgVectorStore consistently uses PostgresBindings. LIKE patterns are properly escaped.
Connection string parsing: Passwords containing @ or : in PgVectorStore (~line 1309) may fail URL parsing silently. Add explicit validation.
API keys: Not leaked in any description or log calls — good.

Test Coverage Gaps

PineconeStore has no tests. Given the silent metadata loss issue and incomplete retry logic, this is a blocker before merging.

SQLiteVectorStore is missing tests for:

Dimension mismatch (second add with different vector dims)
Concurrent access correctness
Transaction rollback recovery

PerformanceTests print metrics but do not assert baselines. Add #expect(elapsed < threshold) assertions so CI catches regressions.

MetalComputeTests hardcode the 5000/10000 vector thresholds from ComputeBackend. These should reference the same constants to avoid drift.

Non-Code Observations

IMPLEMENTATION_TASK_BREAKDOWN.md contains absolute local paths (/Users/chriskarani/CodingProjects/zoni/...). These should be scrubbed before merging. Consider whether a 1,771-line planning document belongs in the repository at all vs. a GitHub Project/issue.

Agent model downgrade (opus → sonnet): Valid cost tradeoff, but security-auditor and penetration-tester agents doing less thorough work on complex tasks is worth documenting explicitly in the PR description.

Actionable Items

Priority	Item
Must Fix	Replace `precondition` with runtime-throwing validation in PgVectorStore
Must Fix	Add warning/error for PineconeStore nested metadata loss
Must Fix	Add PineconeStore tests before merging
Should Fix	Fix GPUVectorStore O(n²) rebuild on bulk insert
Should Fix	Fix O(n) LRU operations in VectorStoreParentLookup
Should Fix	Add retry logic to PineconeStore `add()`/`delete()`
Should Fix	Make RAGEvaluator score parsing failures visible in production
Should Fix	Remove local machine paths from IMPLEMENTATION_TASK_BREAKDOWN.md
Consider	Parallelize OCRProcessor PDF page extraction with TaskGroup
Consider	Add baseline assertions to PerformanceTests
Consider	Make GraphRetriever decay factor configurable
Consider	Document SQLiteVectorStore scaling limits prominently in the README

Overall this is a well-engineered PR with excellent Swift 6 concurrency patterns and documentation. The precondition/release-build SQL validation issue and missing PineconeStore tests are the two items I'd treat as blockers; the rest are good follow-up candidates.

christopherkarani and others added 7 commits January 7, 2026 03:31

chunking

6c95f65

christopherkarani and others added 2 commits January 9, 2026 03:00

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Performance#9

Performance#9
christopherkarani wants to merge 9 commits intomainfrom
performance

christopherkarani commented Jan 8, 2026

Uh oh!

christopherkarani commented Jan 8, 2026

Uh oh!

claude bot commented Jan 8, 2026 •

edited

Loading

Uh oh!

claude bot commented Feb 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

christopherkarani commented Jan 8, 2026

Uh oh!

christopherkarani commented Jan 8, 2026

Uh oh!

claude bot commented Jan 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Comprehensive Code Review

Todo List

Uh oh!

claude bot commented Feb 19, 2026

PR Review: Performance (#9)

Summary

Bugs / Correctness Issues

Critical: precondition Disabled in Release Builds (SQL Injection Risk)

Critical: PineconeStore Silent Metadata Loss

MMRRetriever: Mutable Actor Properties

RAGEvaluator: Silent Production Failures

VectorStoreParentLookup: O(n) LRU Cache Operations

SQLiteVectorStore: O(n·d) Search Unscalable

PineconeStore: Retry Logic Inconsistently Applied

SemanticChunker: Fragile Offset Calculation

GraphRetriever: Hardcoded Decay Factor

Performance Issues

GPUVectorStore: Full Buffer Rebuild on Every Mutation

OCRProcessor: Sequential PDF Page Processing

InMemoryVectorStore+Metal: Full Sort Instead of Partial Selection

Security

Test Coverage Gaps

Non-Code Observations

Actionable Items

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

claude bot commented Jan 8, 2026 •

edited

Loading

Critical: `precondition` Disabled in Release Builds (SQL Injection Risk)