Skip to content

Conversation

@flyingrobots
Copy link
Member

@flyingrobots flyingrobots commented Feb 7, 2026

Summary

  • Compression: Optional gzip compression pipeline via compression: { algorithm: 'gzip' } on store(). Compression is applied before encryption. Decompression on restore() is automatic.
  • KDF support: Passphrase-based encryption using PBKDF2 or scrypt via deriveKey() and passphrase option on store()/restore(). KDF parameters stored in manifest for deterministic re-derivation. All three crypto adapters implement deriveKey().
  • Merkle tree manifests: Large manifests (chunk count exceeding merkleThreshold, default 1000) auto-split into sub-manifests. Root manifest uses version: 2 with subManifests references. readManifest() transparently reconstitutes v2 manifests. Full backward compat with v1.
  • 52 new unit tests across three new test suites (compression, KDF, Merkle). Total: 421 tests.
  • Updated docs: README (What's New), GUIDE (3 new sections), API reference (deriveKey, compression, passphrase options, Merkle fields).

Test plan

  • npm test — 421 unit tests pass
  • npx eslint . — lint clean
  • Pre-push hooks pass (lint + unit tests)
  • CI validates on Node/Bun/Deno
  • npx jsr publish --dry-run --allow-dirty — JSR validation
  • npm pack --dry-run — npm distribution check

Summary by CodeRabbit

  • New Features

    • gzip compression, passphrase-based encryption (PBKDF2/scrypt), Merkle manifests with automatic sub-manifests, and a deriveKey() API.
  • Improvements

    • Public APIs accept passphrase/KDF/compression options; configurable merkleThreshold; manifests now include version, compression, subManifests, and KDF metadata; transparent v1/v2 handling.
  • Bug Fixes

    • KDF/salt handling and options forwarding corrected; improved browser-safe error behavior.
  • Tests

    • Extensive unit coverage for compression, KDFs, Merkle manifests, edge cases, and encryption interactions.
  • Documentation

    • README, guide, changelog, and roadmap updated for v2.0.0.

Add gzip compression pipeline, passphrase-based encryption via PBKDF2/scrypt,
and Merkle tree manifests for large files. Includes 52 new unit tests, updated
API reference, guide, and README documentation.
@coderabbitai
Copy link

coderabbitai bot commented Feb 7, 2026

Warning

Rate limit exceeded

@flyingrobots has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 10 minutes and 11 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📝 Walkthrough

Walkthrough

Adds v2.0.0: gzip compression for stored streams, passphrase-based key derivation (PBKDF2/scrypt) with stored KDF params and deriveKey APIs, and Merkle-tree manifests that split/aggregate large chunk lists; updates schemas, ports/adapters, public APIs, docs, and extensive tests.

Changes

Cohort / File(s) Summary
Release & Docs
CHANGELOG.md, README.md, GUIDE.md, ROADMAP.md, docs/API.md
Add v2.0.0 release notes and new documentation for Compression, Passphrase Encryption (KDF), and Merkle Manifests; reorder guide, update examples and API docs.
Top-level API & Metadata
index.js, index.d.ts, package.json, jsr.json
Add merkleThreshold wiring; expose deriveKey on public surface; extend store/storeFile/restore/restoreFile signatures with passphrase, kdfOptions, and compression; bump version to 2.0.0.
Core Service Logic
src/domain/services/CasService.js, src/domain/services/CasService.d.ts
Implement merkleThreshold and Merkle tree creation/reading; add gzip compress/decompress stream paths; add passphrase-based derive/resolve flows and public deriveKey; update store/restore signatures and error wrapping for sub-manifests.
Value Objects & Schemas
src/domain/value-objects/Manifest.js, src/domain/value-objects/Manifest.d.ts, src/domain/schemas/ManifestSchema.js, src/domain/schemas/ManifestSchema.d.ts
Add version, compression, subManifests to Manifest; add KDF/Compression/SubManifest schemas/types and include optional encryption.kdf.
Crypto Ports & Adapters
src/ports/CryptoPort.js, src/infrastructure/adapters/NodeCryptoAdapter.js, src/infrastructure/adapters/WebCryptoAdapter.js, src/infrastructure/adapters/BunCryptoAdapter.js
Add deriveKey to CryptoPort and adapters (pbkdf2 with SHA‑512 and scrypt); unified return shape { key, salt, params }.
Public API Docs
docs/API.md
Document merkleThreshold, deriveKey on CasService/CryptoPort, updated method signatures and manifest/encryption metadata shapes.
Tests
test/unit/domain/services/CasService.compression.test.js, test/unit/domain/services/CasService.kdf.test.js, test/unit/domain/services/CasService.merkle.test.js, test/unit/domain/services/CasService.errors.test.js
Add extensive unit tests covering compression round-trips, deriveKey (pbkdf2/scrypt), passphrase store/restore, Merkle splitting/reading and threshold boundaries, and error cases.

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant CasService as CasService
    participant CryptoPort as CryptoPort
    participant Compression as Compression
    participant Persistence as Persistence

    Client->>CasService: store({ source, passphrase?, compression?, ... })
    alt passphrase provided
        CasService->>CryptoPort: deriveKey({ passphrase, ... })
        CryptoPort-->>CasService: { key, salt, params }
    end

    alt compression enabled
        CasService->>Compression: _compressStream(source)
        Compression-->>CasService: compressed stream
    else
        CasService->>CasService: use source stream
    end

    CasService->>CryptoPort: encrypt(stream, key?)
    CryptoPort-->>CasService: encrypted chunks
    alt chunks > merkleThreshold
        CasService->>CasService: _createMerkleTree(chunks)
        CasService->>Persistence: writeBlob(subManifest...), writeTree(rootManifest)
    else
        CasService->>Persistence: writeTree(manifest)
    end
    Persistence-->>CasService: manifest (with version/compression/subManifests/kdf)
    CasService-->>Client: Manifest
Loading
sequenceDiagram
    participant Client as Client
    participant CasService as CasService
    participant Persistence as Persistence
    participant CryptoPort as CryptoPort
    participant Decompression as Decompression

    Client->>CasService: restore({ manifest, passphrase? })
    alt manifest.subManifests present
        CasService->>Persistence: readBlob(subManifest1)
        Persistence-->>CasService: subManifest1
        CasService->>Persistence: readBlob(subManifestN)
        Persistence-->>CasService: subManifestN
        CasService->>CasService: merge sub-manifests -> full chunk list
    end

    alt manifest.encryption.kdf present and passphrase provided
        CasService->>CryptoPort: deriveKey({ passphrase, params from manifest })
        CryptoPort-->>CasService: { key, salt, params }
    else
        CasService->>CasService: use provided encryptionKey
    end

    CasService->>CryptoPort: decrypt(chunks, key)
    CryptoPort-->>CasService: decrypted stream
    alt manifest.compression present
        CasService->>Decompression: gunzipAsync(decrypted)
        Decompression-->>CasService: decompressed buffer
    end
    CasService-->>Client: { buffer, bytesWritten }
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Poem

🐇 I nibble on bytes and braid the key tonight,

gzip gives a hug before it takes flight,
pbkdf2 and scrypt sing soft and sweet,
Merkle branches stack our chunks neat,
v2 hops onward — snug, compact, and bright.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'release: v2.0.0 — compression, KDF, Merkle manifests' clearly and concisely summarizes the three main feature additions (compression, KDF, Merkle manifests) introduced in this v2.0.0 release. It directly reflects the primary changes in the changeset.
Docstring Coverage ✅ Passed Docstring coverage is 85.71% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch release/v2.0.0

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (7)
index.js (1)

181-190: ⚠️ Potential issue | 🔴 Critical

storeFile does not forward passphrase, kdfOptions, or compression to the service.

This is the root cause of the README example issue. The destructured parameters on line 181 only include { filePath, slug, filename, encryptionKey }, so any v2.0.0 options passed by callers are silently dropped.

🐛 Proposed fix
-  async storeFile({ filePath, slug, filename, encryptionKey }) {
+  async storeFile({ filePath, slug, filename, encryptionKey, passphrase, kdfOptions, compression }) {
     const source = createReadStream(filePath);
     const service = await this.#getService();
     return await service.store({
       source,
       slug,
       filename: filename || path.basename(filePath),
       encryptionKey,
+      passphrase,
+      kdfOptions,
+      compression,
     });
   }
GUIDE.md (1)

1384-1384: ⚠️ Potential issue | 🟡 Minor

Stale cross-reference: "see Section 10" should be "see Section 13".

The FAQ answer about resilience policy says "(see Section 10)" but after renumbering, the Architecture section (which covers resilience policy) is now Section 13, not Section 10.

📝 Proposed fix
-construction time (see Section 10).
+construction time (see Section 13).
src/domain/services/CasService.js (2)

350-399: 🛠️ Refactor suggestion | 🟠 Major

Orphaned JSDoc block for restore() is separated from its method.

The JSDoc comment at lines 350-361 originally documented restore(), but the insertion of _resolveKeyFromPassphrase (lines 369-380) and _resolveEncryptionKey (lines 386-397) between the JSDoc and the actual restore method (line 399) means the JSDoc now inadvertently documents _resolveKeyFromPassphrase instead. Move the restore JSDoc immediately above line 399, or remove the orphaned block since _resolveKeyFromPassphrase already has its own JSDoc.

♻️ Proposed fix: remove the orphaned JSDoc
-  /**
-   * Restores a file from its manifest by reading and reassembling chunks.
-   *
-   * If the manifest has encryption metadata, decrypts the reassembled
-   * ciphertext using the provided key.
-   *
-   * `@param` {Object} options
-   * `@param` {import('../value-objects/Manifest.js').default} options.manifest - The file manifest.
-   * `@param` {Buffer} [options.encryptionKey] - 32-byte key, required if manifest is encrypted.
-   * `@returns` {Promise<{ buffer: Buffer, bytesWritten: number }>}
-   * `@throws` {CasError} MISSING_KEY if manifest is encrypted but no key is provided.
-   * `@throws` {CasError} INTEGRITY_ERROR if chunk verification or decryption fails.
-   */
   /**
    * Resolves the encryption key from a passphrase using KDF params from the manifest.

550-573: 🛠️ Refactor suggestion | 🟠 Major

Orphaned JSDoc for verifyIntegrity is separated from its method by deriveKey.

Same issue: the verifyIntegrity JSDoc at lines 550-555 is separated from the actual method at line 573 by the deriveKey method (lines 556-571). The JSDoc attaches to deriveKey instead.

♻️ Proposed fix: move deriveKey above the verifyIntegrity JSDoc, or reorder
+  async deriveKey(options) {
+    return await this.crypto.deriveKey(options);
+  }
+
   /**
    * Verifies the integrity of a stored file by re-hashing its chunks.
    * `@param` {import('../value-objects/Manifest.js').default} manifest
    * `@returns` {Promise<boolean>}
    */
-  /**
-   * Derives an encryption key from a passphrase using PBKDF2 or scrypt.
-   * ...
-   */
-  async deriveKey(options) {
-    return await this.crypto.deriveKey(options);
-  }
-
   async verifyIntegrity(manifest) {
index.d.ts (1)

120-125: ⚠️ Potential issue | 🟠 Major

storeFile() does not accept passphrase, kdfOptions, or compression options, contradicting GUIDE.md examples.

The storeFile() method (lines 120-125 in index.d.ts) accepts only filePath, slug, filename?, and encryptionKey?. However, GUIDE.md demonstrates storeFile() with compression (line 775) and passphrase (line 836).

The implementation (index.js lines 181-189) only forwards four parameters to service.store() and does not forward these options. Users following the GUIDE examples will have these options silently ignored.

Either extend storeFile() to accept and forward these options, or update the GUIDE examples to use store() instead.

docs/API.md (2)

150-163: ⚠️ Potential issue | 🟠 Major

Update method signature to include all parameters.

The method signature on line 150 shows only { filePath, slug, filename, encryptionKey }, but the parameters list documents passphrase, kdfOptions, and compression. This inconsistency will mislead API consumers.

📝 Proposed fix to update the signature
 #### storeFile

 ```javascript
-await cas.storeFile({ filePath, slug, filename, encryptionKey })
+await cas.storeFile({ filePath, slug, filename, encryptionKey, passphrase, kdfOptions, compression })

</details>

---

`211-220`: _⚠️ Potential issue_ | _🟠 Major_

**Update method signature to include `passphrase` parameter.**

The method signature on line 211 omits the `passphrase` parameter that is documented in the parameters list on line 220.



<details>
<summary>📝 Proposed fix to update the signature</summary>

```diff
 #### restoreFile

 ```javascript
-await cas.restoreFile({ manifest, encryptionKey, outputPath })
+await cas.restoreFile({ manifest, encryptionKey, passphrase, outputPath })

</details>

</blockquote></details>

</blockquote></details>
🤖 Fix all issues with AI agents
In `@GUIDE.md`:
- Around line 842-850: The example showing manifest.encryption.kdf includes a
hash field but the types and schema (KdfParams in Manifest.d.ts and KdfSchema in
ManifestSchema.js) do not define it; either remove the hash entry from the docs
example or add a hash property to the type and schema. To fix, decide whether
KDF implementations expose a hash name (e.g., 'sha-512'); if yes, add hash:
string to KdfParams in Manifest.d.ts and add a corresponding { hash: { type:
'string' } } entry to KdfSchema in ManifestSchema.js (including any
validation/defaults), otherwise update the example (manifest.encryption.kdf) to
omit the hash field so it matches the current KdfParams/KdfSchema.
- Around line 937-942: The public ContentAddressableStore constructor currently
ignores the merkleThreshold option shown in GUIDE.md; update the constructor to
accept merkleThreshold and forward it into the CasService initialization so
callers can configure it. Specifically, modify ContentAddressableStore
(constructor in index.js) to include merkleThreshold in its parameter parsing
and pass merkleThreshold when creating new CasService(...), ensuring defaulting
behavior matches CasService (1000) if not provided; alternatively, update
GUIDE.md to remove the merkleThreshold example if you prefer not to expose it.

In `@README.md`:
- Around line 73-79: The README example is accurate but storeFile() currently
ignores passphrase and compression; update the storeFile function (the exported
storeFile in index.js) to accept passphrase and compression in its parameter
destructuring (alongside filePath, slug, filename, encryptionKey) and forward
those options through to service.store({ ... }) so service.store receives
passphrase and compression; also update any related JSDoc/typing for storeFile
and adjust unit tests or callers that rely on its signature.

In `@src/infrastructure/adapters/NodeCryptoAdapter.js`:
- Around line 80-86: The code uses saltBuf.toString('base64') which produces
incorrect output for Uint8Array salts; update NodeCryptoAdapter so params.salt
is created with Buffer.from(saltBuf).toString('base64') (matching
BunCryptoAdapter) and ensure the function returns/wraps the salt value as the
base64 string in the same shape the Bun adapter uses; locate saltBuf,
params.salt and randomBytes in NodeCryptoAdapter and replace the toString call
accordingly.

In `@src/infrastructure/adapters/WebCryptoAdapter.js`:
- Around line 155-165: The private method `#deriveScrypt` currently imports
node:crypto (scrypt) which breaks non-Node environments; update `#deriveScrypt` to
detect availability of node:crypto before importing and if unavailable throw a
clear, descriptive error (e.g., "scrypt not supported in this runtime; requires
Node.js crypto") or return a documented fallback; specifically modify the
`#deriveScrypt` function to wrap the dynamic import in a try/catch or
feature-check and throw that descriptive error (or call an alternative
browser-friendly implementation) so callers of WebCryptoAdapter (per the class
JSDoc) receive a clear failure in non-Node runtimes.
- Around line 134-137: The code currently treats any algorithm value other than
'pbkdf2' as scrypt; update the logic around the opts/key assignment to
explicitly validate the algorithm variable and throw an Error for unsupported
KDFs (same behavior as NodeCryptoAdapter/BunCryptoAdapter) instead of silently
falling back to `#deriveScrypt`; ensure you call this.#derivePbkdf2(opts) only
when algorithm === 'pbkdf2', call this.#deriveScrypt(opts) only when algorithm
=== 'scrypt', and throw Error(`Unsupported KDF algorithm: ${algorithm}`) for any
other value so consumers get a clear failure.
🧹 Nitpick comments (9)
src/domain/value-objects/Manifest.js (2)

25-25: Prefer ?? over || for defaulting version.

data.version || 1 treats any falsy value (including 0) as missing. While the Zod schema currently prevents 0 from reaching this line, using nullish coalescing (??) more precisely communicates the intent of "default when absent" and is resilient against future schema changes.

♻️ Suggested change
-      this.version = data.version || 1;
+      this.version = data.version ?? 1;

46-57: toJSON() correctly includes new fields, but JSDoc @returns is now stale.

The @returns type on line 44 still lists only slug, filename, size, chunks, encryption. Consider updating it to reflect version, compression, and subManifests.

test/unit/domain/services/CasService.merkle.test.js (1)

44-51: readTree regex will throw an unguarded NPE if the entry format changes.

If CasService.createTree ever changes its tree-entry format, match will be null and match[1] will throw a TypeError with no useful message. A guard or descriptive assertion would make test failures easier to diagnose.

♻️ Suggested improvement
       return Promise.resolve(entries.map((e) => {
         const match = e.match(/^(\d+) (\w+) ([^\t]+)\t(.+)$/);
+        if (!match) throw new Error(`readTree mock: unexpected entry format: ${e}`);
         return { mode: match[1], type: match[2], oid: match[3], name: match[4] };
       }));
src/domain/schemas/ManifestSchema.js (1)

16-25: KdfSchema doesn't enforce algorithm-specific required fields.

For pbkdf2, iterations should be required; for scrypt, cost/blockSize/parallelization should be required. Currently all are optional, so a malformed manifest (e.g., scrypt without cost) would pass validation but fail silently at re-derivation. A Zod discriminated union or .refine() would catch this at the schema boundary.

Not blocking since the adapters always produce complete params, but worth hardening if manifests can be hand-crafted or come from untrusted sources.

test/unit/domain/services/CasService.compression.test.js (1)

11-56: Consider extracting shared test helpers (bufferSource, in-memory persistence factory) into a common module.

bufferSource and the in-memory blob store setup are duplicated in CasService.merkle.test.js and here. Extracting them to e.g. test/helpers/cas-test-utils.js would reduce maintenance burden as the test suite grows. Not blocking for this PR.

test/unit/domain/services/CasService.kdf.test.js (1)

258-267: Redundant second restore call with wrong passphrase.

Lines 258-260 already assert that the restore rejects with CasError. The try/catch block at lines 262-266 re-executes the same failing call just to check err.code. This doubles the work (expensive for KDF). Consolidate into a single assertion.

♻️ Suggested consolidation
-    await expect(
-      service.restore({ manifest, passphrase: wrongPassphrase }),
-    ).rejects.toThrow(CasError);
-
-    try {
-      await service.restore({ manifest, passphrase: wrongPassphrase });
-    } catch (err) {
-      expect(err.code).toBe('INTEGRITY_ERROR');
-    }
+    await expect(
+      service.restore({ manifest, passphrase: wrongPassphrase }),
+    ).rejects.toThrow(
+      expect.objectContaining({ code: 'INTEGRITY_ERROR' }),
+    );
src/domain/services/CasService.js (2)

174-183: Inconsistent import style: dynamic imports in _compressStream vs. static import of gunzip at top.

gunzip is statically imported at line 7, but _compressStream dynamically imports createGzip and Readable from the same/sibling modules. Since node:zlib is already a static dependency, prefer static imports for consistency and to avoid the async overhead on each call.

♻️ Suggested: move to static imports at the top of the file
 import { gunzip } from 'node:zlib';
+import { createGzip } from 'node:zlib';
+import { Readable } from 'node:stream';
 import { promisify } from 'node:util';

Then simplify the method:

   async *_compressStream(source) {
-    const { createGzip } = await import('node:zlib');
-    const { Readable } = await import('node:stream');
     const gz = createGzip();
     const input = Readable.from(source);

386-397: _resolveEncryptionKey doesn't await the passphrase path but returns a Promise — verify callers always await.

Line 388 returns the result of _resolveKeyFromPassphrase (a Promise) without await, while the encryptionKey path at line 396 returns Promise.resolve(encryptionKey). Both paths return Promises, so restore at line 400 correctly awaits. This is fine, but the method is not async — it relies on returning raw Promises. Consider making it async for consistency, since it mixes sync throws (line 394) with Promise returns.

Note: The sync throw at line 394 works correctly because the caller awaits the result, and a thrown error in a non-async function that returns a Promise will propagate as an unhandled exception rather than a rejected Promise. However, since restore wraps it in await, the throw occurs before the Promise is created, so it propagates correctly as a synchronous exception from restore's perspective.

Actually, this is a subtle correctness concern: if manifest.encryption?.encrypted is true and no key/passphrase is provided, the throw at line 394 is synchronous. Since _resolveEncryptionKey is not async, this throw propagates synchronously from within restore(). Because restore() is async, the throw will be caught and converted to a rejected promise. This is safe. No action needed unless you prefer uniformity.

docs/API.md (1)

360-375: Consider clarifying when to use deriveKey() directly vs. passphrase parameter.

The example shows manually calling deriveKey() and then passing the derived key to store() via the encryptionKey parameter. Since store() now accepts a passphrase parameter directly, it would help to explain when and why a developer would choose the manual approach (e.g., when you need explicit control over salt/params, want to derive once and reuse the key, or need to store KDF parameters separately for later verification).

💡 Suggested enhancement

Add a clarifying sentence before the example:

 **Example:**
+
+// Manual key derivation gives you explicit control over salt and KDF parameters.
+// Alternatively, use the `passphrase` option directly in store() for automatic KDF.

 ```javascript
 const { key, salt, params } = await cas.deriveKey({

The KDF output example showed `hash: 'sha-512'` but KdfParams schema
does not include a `hash` field. Remove it to match the actual output.
The facade constructor now accepts and forwards merkleThreshold to
CasService, matching the GUIDE documentation example.
storeFile() now accepts and forwards passphrase, kdfOptions, and
compression options to service.store(), matching the store() API.
Uint8Array.prototype.toString() ignores encoding arguments, producing
comma-separated byte lists instead of base64. Wrap with Buffer.from()
to ensure correct base64 output matching BunCryptoAdapter behavior.
Previously any non-pbkdf2 algorithm silently fell through to scrypt.
Now explicitly validates and throws for unsupported algorithms,
matching NodeCryptoAdapter and BunCryptoAdapter behavior.
The scrypt KDF path imports node:crypto which isn't available in
browsers. Wrap the import in try/catch so non-Node runtimes get a
clear error instead of an opaque import failure.
Add Fixed section covering storeFile param forwarding,
NodeCryptoAdapter salt encoding, and WebCryptoAdapter KDF validation.
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
GUIDE.md (1)

1377-1384: ⚠️ Potential issue | 🟡 Minor

Stale cross-reference: "see Section 10" should be "see Section 13".

The resilience policy is documented in the Architecture section, which was renumbered from 10 to 13 after the new sections were added.

📝 Proposed fix
-construction time (see Section 10).
+construction time (see Section 13).
🧹 Nitpick comments (4)
index.js (1)

122-136: Factory methods don't forward merkleThreshold (or crypto).

createJson and createCbor don't accept or propagate merkleThreshold, meaning callers using these convenience factories can't opt into Merkle manifests. Consider forwarding it (and optionally crypto) for consistency with the constructor.

♻️ Suggested change
-  static createJson({ plumbing, chunkSize, policy }) {
-    return new ContentAddressableStore({ plumbing, chunkSize, codec: new JsonCodec(), policy });
+  static createJson({ plumbing, chunkSize, policy, crypto, merkleThreshold }) {
+    return new ContentAddressableStore({ plumbing, chunkSize, codec: new JsonCodec(), policy, crypto, merkleThreshold });
   }
-  static createCbor({ plumbing, chunkSize, policy }) {
-    return new ContentAddressableStore({ plumbing, chunkSize, codec: new CborCodec(), policy });
+  static createCbor({ plumbing, chunkSize, policy, crypto, merkleThreshold }) {
+    return new ContentAddressableStore({ plumbing, chunkSize, codec: new CborCodec(), policy, crypto, merkleThreshold });
   }
index.d.ts (1)

98-108: Factory methods createJson/createCbor don't accept merkleThreshold.

These convenience factories only expose plumbing, chunkSize, and policy. If users need merkleThreshold, they must use the main constructor. This is fine if intentional, but worth confirming it's a deliberate design choice — the GUIDE shows merkleThreshold on the constructor, and these factories are meant for quick setup.

src/infrastructure/adapters/NodeCryptoAdapter.js (1)

100-102: Inconsistent error type: plain Error vs CasError.

deriveKey throws a plain Error for unsupported algorithms while the rest of this adapter uses CasError with structured codes. Consider using CasError with a descriptive code (e.g., 'UNSUPPORTED_KDF') for consistency, though this is low priority since the other adapters also use plain Error here.

src/infrastructure/adapters/WebCryptoAdapter.js (1)

147-158: Side effect: #derivePbkdf2 and #deriveScrypt mutate the shared params object.

Both private helpers modify params by reference (e.g., params.iterations = iterations on line 156, params.cost = cost on line 172). This works because params is a freshly created local object in deriveKey, but the mutation-by-reference pattern couples these helpers to the caller's object lifecycle. A purely functional approach (returning the extra fields to be merged by the caller) would be cleaner, but this is a minor style nit given the current scope.

@flyingrobots
Copy link
Member Author

@coderabbitai review

@coderabbitai
Copy link

coderabbitai bot commented Feb 7, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

- Move restore() JSDoc directly above its method (was orphaned by inserted helpers)
- Move verifyIntegrity() JSDoc directly above its method (was orphaned by deriveKey)
- Add missing passphrase param to restore() JSDoc
- Update storeFile/restoreFile signatures in API.md to include v2 params
- Fix stale "Section 10" cross-reference to "Section 13" in GUIDE.md
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
docs/API.md (2)

486-498: ⚠️ Potential issue | 🟡 Minor

CasService constructor code block is missing merkleThreshold.

Line 488 shows new CasService({ persistence, codec, crypto, chunkSize }) but merkleThreshold is documented as a parameter at line 497 and is part of the actual constructor signature.

📝 Proposed fix
-new CasService({ persistence, codec, crypto, chunkSize })
+new CasService({ persistence, codec, crypto, chunkSize, merkleThreshold })

519-529: ⚠️ Potential issue | 🟡 Minor

CasService methods list is stale — missing new v2.0.0 parameters and deriveKey.

The method signatures listed here don't reflect the v2.0.0 additions:

  • store() is missing passphrase, kdfOptions, compression
  • restore() is missing passphrase
  • deriveKey(options) is not listed at all
📝 Proposed fix
 All methods from ContentAddressableStore delegate to CasService. See ContentAddressableStore documentation above for:
 
-- `store({ source, slug, filename, encryptionKey })`
-- `restore({ manifest, encryptionKey })`
+- `store({ source, slug, filename, encryptionKey, passphrase, kdfOptions, compression })`
+- `restore({ manifest, encryptionKey, passphrase })`
 - `createTree({ manifest })`
 - `verifyIntegrity(manifest)`
 - `readManifest({ treeOid })`
 - `deleteAsset({ treeOid })`
 - `findOrphanedChunks({ treeOids })`
 - `encrypt({ buffer, key })`
 - `decrypt({ buffer, key, meta })`
+- `deriveKey(options)`
src/domain/services/CasService.js (1)

280-318: ⚠️ Potential issue | 🔴 Critical

Sub-manifest blobs are not included as tree entries and will be garbage-collected.

The sub-manifest blobs are written to Git's ODB via writeBlob() (line 289), with their OIDs stored in the root manifest's subManifests array. However, they are not added as entries in the Git tree (lines 312–315). Only the root manifest blob and chunk blobs are tree entries.

Since Git's garbage collector walks only the object graph (refs → commits → trees → objects) and does not parse blob content for OID references, these sub-manifest blobs are unreachable. Running git gc --prune will delete them as loose objects. Subsequently, readManifest() will fail when attempting to read the sub-manifest blobs by OID.

Add sub-manifest blobs as tree entries
     const treeEntries = [
       `100644 blob ${rootOid}\tmanifest.${this.codec.extension}`,
+      ...subManifestRefs.map((ref) => `100644 blob ${ref.oid}\tsub-manifest-${ref.startIndex}.${this.codec.extension}`),
       ...chunkBlobEntries,
     ];
🧹 Nitpick comments (2)
src/domain/services/CasService.js (2)

7-12: Inconsistent import strategy for node:zlib.

gunzip is statically imported at line 7, but _compressStream (line 175) dynamically imports createGzip from the same module. Consider using a static import for createGzip as well for consistency.

♻️ Suggested refactor
-import { gunzip } from 'node:zlib';
-import { promisify } from 'node:util';
+import { gunzip, createGzip } from 'node:zlib';
+import { promisify } from 'node:util';
+import { Readable } from 'node:stream';

Then in _compressStream:

  async *_compressStream(source) {
-   const { createGzip } = await import('node:zlib');
-   const { Readable } = await import('node:stream');
    const gz = createGzip();

36-48: Consider validating merkleThreshold.

chunkSize is validated (must be ≥ 1024), but merkleThreshold has no validation. A value of 0 or a negative number would cause every non-empty manifest to take the Merkle path, which is likely unintended.

🛡️ Suggested validation
     this.chunkSize = chunkSize;
     this.merkleThreshold = merkleThreshold;
+    if (!Number.isInteger(merkleThreshold) || merkleThreshold < 1) {
+      throw new Error('Merkle threshold must be a positive integer');
+    }
   }

Sub-manifest blobs were written to Git's ODB but not added as tree
entries, making them unreachable and subject to garbage collection.
Add them as `sub-manifest-N.json` entries in the tree.

Also: use static imports for node:zlib/node:stream, validate
merkleThreshold, and update CasService docs in API.md.
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/domain/services/CasService.js (1)

203-226: ⚠️ Potential issue | 🟡 Minor

Accepting both passphrase and encryptionKey silently favors the passphrase — consider guarding against ambiguous input.

If a caller passes both passphrase and encryptionKey, lines 205-209 silently overwrite encryptionKey with the derived key. This is surprising behavior that could mask bugs at the call site.

Additionally, compression.algorithm is stored in the manifest (line 225) but _compressStream always uses gzip. If a caller passes { algorithm: 'brotli' }, the manifest metadata will be wrong while the data is still gzip-compressed.

Suggested guard for mutually exclusive key sources and algorithm validation
  async store({ source, slug, filename, encryptionKey, passphrase, kdfOptions, compression }) {
+   if (passphrase && encryptionKey) {
+     throw new CasError(
+       'Provide either encryptionKey or passphrase, not both',
+       'INVALID_OPTIONS',
+     );
+   }
+
    let kdfParams;
    if (passphrase) {
      const derived = await this.deriveKey({ passphrase, ...kdfOptions });
      encryptionKey = derived.key;
      kdfParams = derived.params;
    }

    if (encryptionKey) {
      this._validateKey(encryptionKey);
    }

    const manifestData = {
      slug,
      filename,
      size: 0,
      chunks: [],
    };

    let processedSource = source;
    if (compression) {
+     if (compression.algorithm && compression.algorithm !== 'gzip') {
+       throw new CasError(
+         `Unsupported compression algorithm: ${compression.algorithm}`,
+         'INVALID_OPTIONS',
+       );
+     }
      processedSource = this._compressStream(processedSource);
-     manifestData.compression = { algorithm: compression.algorithm || 'gzip' };
+     manifestData.compression = { algorithm: 'gzip' };
    }
🤖 Fix all issues with AI agents
In `@docs/API.md`:
- Around line 496-499: The docs currently only document the chunkSize validation
error but omit the merkleThreshold validation; update the API docs where options
`chunkSize` and `merkleThreshold` are described to also state that the
constructor throws an Error when `merkleThreshold` is not a positive integer
(matches the runtime check that raises "Merkle threshold must be a positive
integer" in CasService constructor/validation), so add a short Throws line
mentioning both errors (chunkSize < 1024 and invalid merkleThreshold).

In `@test/unit/domain/services/CasService.merkle.test.js`:
- Around line 352-354: Update the duplicated comment header numbering in the
test suite: locate the block with the header text "//
---------------------------------------------------------------------------"
followed by "8. One above threshold uses v2" (the second occurrence) and change
"8." to "9.", then increment the numbering for the following section headers
(currently 9–11) so they read 10–12 to restore correct sequence; ensure you only
modify the comment strings (not test logic) so the descriptive section numbers
are consistent.
🧹 Nitpick comments (2)
src/domain/services/CasService.js (1)

376-391: _resolveEncryptionKey is not async but sometimes returns a Promise — consider making it async for consistency and clearer error propagation.

The method is a synchronous function that returns either the result of _resolveKeyFromPassphrase (a Promise) or Promise.resolve(encryptionKey). This works because the caller awaits the result, but the mix of synchronous throws (line 385, 388) and returned Promises is subtle. Making it async would unify all error paths into rejected promises.

Also, if a caller provides passphrase but the manifest has no kdf metadata (e.g., manifest was stored with a raw key), the method falls through to line 387 and throws MISSING_KEY with a message about "Encryption key required" — which is confusing when the user did provide a passphrase. A more specific error would help debugging.

Suggested fix
- _resolveEncryptionKey(manifest, encryptionKey, passphrase) {
+ async _resolveEncryptionKey(manifest, encryptionKey, passphrase) {
    if (passphrase && manifest.encryption?.kdf) {
      return this._resolveKeyFromPassphrase(passphrase, manifest.encryption.kdf);
    }
+   if (passphrase && manifest.encryption?.encrypted && !manifest.encryption?.kdf) {
+     throw new CasError(
+       'Manifest was not stored with passphrase-based encryption; provide encryptionKey instead',
+       'MISSING_KEY',
+     );
+   }
    if (encryptionKey) {
      this._validateKey(encryptionKey);
    }
    if (manifest.encryption?.encrypted && !encryptionKey) {
      throw new CasError('Encryption key required to restore encrypted content', 'MISSING_KEY');
    }
-   return Promise.resolve(encryptionKey);
+   return encryptionKey;
  }
test/unit/domain/services/CasService.merkle.test.js (1)

490-527: findManifestBlob / findLastManifestBlob rely on blob content not accidentally matching the manifest heuristic.

These helpers identify manifest blobs by checking for slug and filename string fields after JSON-decoding every blob. This works with generateBuffer's binary output (which won't parse as JSON), but it's a fragile coupling. If a future test stores JSON content with those fields, these helpers would return wrong results.

Consider keying off the OID used in the tree entries instead — you already have access to trees in the tests that use it. That said, for the current test suite this is fine.

- Reject providing both passphrase and encryptionKey in store/restore
- Reject unsupported compression algorithms (only gzip supported)
- Better error when passphrase given but manifest lacks KDF metadata
- Make _resolveEncryptionKey async for consistent error propagation
- Fix test section numbering in merkle tests (7→12)
- Document merkleThreshold validation error in API.md
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
docs/API.md (3)

1111-1121: ⚠️ Potential issue | 🟡 Minor

INVALID_OPTIONS error code is missing from the error codes table.

store() and restore() throw CasError with code INVALID_OPTIONS when both passphrase and encryptionKey are provided, or when an unsupported compression algorithm is specified (see CasService.js lines 209–213 and 222–226). This code should be added to the table.

Suggested addition
 | `STREAM_ERROR` | Stream error occurred during store operation | `store()` |
 | `MANIFEST_NOT_FOUND` | No manifest entry found in the Git tree | `readManifest()`, `deleteAsset()`, `findOrphanedChunks()` |
 | `GIT_ERROR` | Underlying Git plumbing command failed | `readManifest()`, `deleteAsset()`, `findOrphanedChunks()` |
+| `INVALID_OPTIONS` | Mutually exclusive options provided or unsupported option value | `store()`, `restore()` |

128-133: ⚠️ Potential issue | 🟡 Minor

Throws section for store() is incomplete.

Missing CasError with code INVALID_OPTIONS thrown when both passphrase and encryptionKey are provided, or when an unsupported compression algorithm is specified.

Suggested addition
 - `CasError` with code `STREAM_ERROR` if the source stream fails
+- `CasError` with code `INVALID_OPTIONS` if both `passphrase` and `encryptionKey` are provided
+- `CasError` with code `INVALID_OPTIONS` if an unsupported compression algorithm is specified

194-201: ⚠️ Potential issue | 🟡 Minor

Throws section for restore() is missing INVALID_OPTIONS.

restore() also rejects when both passphrase and encryptionKey are provided, and when passphrase is used but the manifest lacks KDF metadata (MISSING_KEY). The INVALID_OPTIONS case should be documented.

Suggested addition
 - `CasError` with code `INTEGRITY_ERROR` if decryption fails
+- `CasError` with code `INVALID_OPTIONS` if both `passphrase` and `encryptionKey` are provided
🤖 Fix all issues with AI agents
In `@src/domain/services/CasService.js`:
- Around line 451-453: The decompression step in CasService using gunzipAsync
isn't wrapped and can throw raw zlib errors; wrap the gunzipAsync call (when
manifest.compression is true) in a try/catch and on failure throw a CasError
with type INTEGRITY_ERROR (mirroring the decryption error handling around
decryptBuffer) and include the original error/message for context so callers get
consistent CasError-based error handling.
🧹 Nitpick comments (3)
src/domain/services/CasService.js (2)

187-229: Orphaned JSDoc: store() method has no associated documentation block.

The JSDoc block at lines 187–201 was originally for store(), but the insertion of _validateKeySourceExclusive (line 203) and _validateCompression (line 216) between the JSDoc and the store() declaration at line 229 means the store method's JSDoc is now attached to _validateKeySourceExclusive instead. The store() method itself ends up with no JSDoc.

Move _validateKeySourceExclusive and _validateCompression above the store() JSDoc block, or move the store() JSDoc to directly precede line 229.

Suggested reordering
+  _validateKeySourceExclusive(encryptionKey, passphrase) { ... }
+  _validateCompression(compression) { ... }
+
   /**
    * Chunks an async iterable source and stores it in Git.
    * ...
    */
-  /**
-   * Validates that passphrase and encryptionKey are not both provided.
-   * `@private`
-   */
-  _validateKeySourceExclusive(encryptionKey, passphrase) { ... }
-
-  _validateCompression(compression) { ... }
-
   async store({ source, slug, filename, encryptionKey, passphrase, kdfOptions, compression }) {

172-185: _compressStream lacks error handling for pipe failures.

If the source async iterable throws mid-stream, the error may surface as an opaque ERR_STREAM_PREMATURE_CLOSE from the piped gzip stream rather than the original error. Consider adding an error listener on input to propagate the original error through the compressed stream.

test/unit/domain/services/CasService.merkle.test.js (1)

490-527: findManifestBlob / findLastManifestBlob rely on Map insertion order — consider adding a clarifying comment.

These helpers depend on Map preserving insertion order (guaranteed by the spec) and on the fact that _createMerkleTree writes sub-manifests before the root manifest. The findLastManifestBlob comment at line 511-512 captures this — good. However, findManifestBlob returning the first manifest-like blob could be brittle: for v2 trees, the first manifest blob found would be a sub-manifest (which lacks slug/filename fields), so it works by coincidence since sub-manifests don't have those fields.

This is fine for now but worth being aware of if sub-manifest structure ever gains slug/filename.

…SDoc

- Wrap gunzipAsync in try/catch, throw CasError INTEGRITY_ERROR on failure
- Move _validateKeySourceExclusive and _validateCompression above store()
  JSDoc so the JSDoc attaches to the correct method
- Document INVALID_OPTIONS error code in API.md (store/restore throws,
  error codes table)
- Document INTEGRITY_ERROR for decompression failure in restore() throws
@flyingrobots flyingrobots merged commit 4a0d290 into main Feb 8, 2026
6 checks passed
@flyingrobots flyingrobots deleted the release/v2.0.0 branch February 8, 2026 17:24
@coderabbitai coderabbitai bot mentioned this pull request Feb 8, 2026
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant