Skip to content

Conversation

jimhark
Copy link

@jimhark jimhark commented Jul 21, 2025

Summary

codec.js/decode() may calculate encryptedDataHash multiple times. calculatedHMAC is based on encryptedDataHash which can be arbitrarily large. Recalculating this hash when the data hasn't changed is a performance bug. The fix is to calculate the hash once. While reviewing this change the use of recursion was replaced with explicit retries.

Fixes

Fixes #226

Details

  • Calculate and reuse encryptedDataHash to avoid rehashing
  • Unwind the use of recursion to simplify the code and improve readability
  • function is actually shorter (LoC AND lines of text)

Testing

I manually tested node encrypt, node decrypt, and HTML wrapper decrypt.
It's not clear to me exactly how I could test the backwards compatible password hashes. Additional testing is recommended.

Notes for Reviewers

Multiple hashing has been around a while, but it was not easy to address before PR #219. I'd say at that point the performance issue became a bug because it was fixable.

Probably variable names 'hashedPassword2' and 'hashedPassword3' could be improved. I assigned them in the order they were used, but:

const hashedPassword2 = await cryptoEngine.hashThirdRound(hashedPassword, salt);
let hashedPassword3 = await cryptoEngine.hashSecondRound(hashedPassword, salt);

May be confusing.

jimhark and others added 6 commits July 20, 2025 20:14
This is part of a push to support larger files. The focus is the switch
to using Uint8Array to store binary data. But also includes:

- When running on Node, use Buffer.from() for hex string conversions.

- To avoid large buffer copy, signedMsg as been replaced by an object
    containing  iv, encrypted, and hmac.

- hmac calculation has changed so it avoids copying (possibly very
    large) encrypted data. See signDigest() in lib/codec.js.

- Minor cleanup

Handling hex encode/decode at the input/output boundaries and using
Uint8Array internally for representing binary data has these benefits:

- More memory efficient, allows processing of 2x larger files.

- Aligns with cryptographic best practices: hashing is now performed
  on raw binary data (Uint8Array) instead of hex strings.

- Behavior is (mostly) unchanged
  - scripts/index_template.html textContent is not implemented and
    needs to be redesigned.
makes unnamed function more self documenting
Also removed use of recursion to improve readability (and debugability).
As a bonus, function is actually shorter (LoC AND lines of text)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

codec.js decode() calculates hash of encrypted multiple times
1 participant