Harden codebase by kwsantiago · Pull Request #137 · privkeyio/keep-esp32

kwsantiago · 2026-02-12T19:04:11Z

Add KEEP_ASSERT macro with 58 precondition checks on crypto/storage paths
Replace all hot-path malloc/free with bounded stack/static buffers
Split 4 oversized functions (frost_sign, frost_commit, frost_session_resume, app_main)
Consolidate duplicate logging macros into log_compat.h

Summary by CodeRabbit

Bug Fixes
- Hardened input validation, bounds checks, and subscription validation; enforced payload/message size limits and safer publish/send timeouts.
- JSON-RPC error codes now preserve standard-range codes and default non-RPC errors to a consistent server error code.
Improvements
- New public "assert" error surfaced as a distinct error code.
- Unified cross-platform logging, tighter memory zeroing, and PSBT moved to fixed-size storage for safer handling.
Tests
- Updated tests to reflect JSON-RPC code defaults and PSBT content validation.

- Add KEEP_ASSERT macro with 58 precondition checks on crypto/storage paths - Replace all hot-path malloc/free with bounded stack/static buffers - Split 4 oversized functions (frost_sign, frost_commit, frost_session_resume, app_main) - Consolidate duplicate logging macros into log_compat.h

coderabbitai · 2026-02-12T19:04:29Z

Warning

Rate limit exceeded

@kwsantiago has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 2 minutes and 35 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

Walkthrough

Adds assertion-based error handling and ERR_ASSERT, centralizes cross-platform logging, replaces several heap allocations with fixed-size buffers, tightens input validation across modules, and refactors FROST signing/commit flows into validation and execution stages.

Changes

Cohort / File(s)	Summary
Error handling & assertions `main/error_codes.h`, `main/error_codes.c`	Adds `ERR_CAT_ASSERT`/`ERR_ASSERT`, `KEEP_ASSERT*` macros, integrates `ERR_ASSERT` into name/string mappings, and exposes `int error_to_jsonrpc_code(int)`.
Logging compatibility `main/log_compat.h`, `main/random_utils.c`, `main/frost_dkg.c`, `main/frost_signer.c`, `main/storage_crypto.c`, `main/storage_export.c`	Introduces `log_compat.h` and migrates modules to use unified `ESP_LOG*` macros (removes platform-specific ad-hoc logging branches).
FROST signer refactor `main/frost_signer.c`, `main/frost_signer_core.c`, `main/frost_signer_storage.c`	Splits commit/sign into validate+generate and validate_rng+check_policy+execute stages, adds session resume helpers, standardizes return codes, and adds input guards; mostly internal reorganization.
Coordinator & WebSocket robustness `main/frost_coordinator.c`	Adds `MAX_WS_MESSAGE_SIZE`, `MAX_PUBLISH_MSG_SIZE`, send timeout constant, safe subscription-id validation, tighter parsing, prompt frees, and timeout-bound sends.
DKG & crypto ops memory hardening `main/frost_dkg.c`, `main/frost_crypto_ops.c`	Replace heap allocations with fixed-size stack buffers, add `secure_memzero` wiping, and use `KEEP_ASSERT` preconditions to reduce dynamic-allocation error paths.
Protocol / PSBT change `main/protocol.h`, `main/protocol.c`, `test/native/test_protocol.c`	Changes `rpc_request_t.psbt` from `char *` to `char psbt[PROTOCOL_MAX_PSBT_LEN]`, adjusts parsing/cleanup and tests to check content (`psbt[0]`) rather than pointer.
Nostr sign payload limits `main/nostr_frost.h`, `main/nostr_frost_sign.c`	Adds `MAX_SIGN_PAYLOAD_SIZE` (4096), enforces hard payload limits, adjusts allocation/serialization, and tightens cleanup.
Checkpoint & storage safety `main/storage_checkpoint.c`, `main/storage.c`, `main/storage_metadata.c`, `main/storage_crypto.c`, `main/storage_export.c`	Adds `CHECKPOINT_MAX_DATA_SIZE`, mutex locking for checkpoint ops on ESP, boundary checks, `KEEP_ASSERT` input guards, and ensures locks/unlocks and secure-zeroing on error paths.
Random utils / logging unification `main/random_utils.c`	Unconditionally includes `log_compat.h` and uses `ESP_LOG*` macros (removes previous no-op logging branch).
Tests updated `test/native/test_protocol.c`	Updated expected JSON-RPC error codes to `-32603` for non-JSONRPC errors and adjusted PSBT presence assertions to check `psbt[0]`.

Sequence Diagram(s)

mermaid
sequenceDiagram
autonumber
participant Client
participant FrostSigner as "Frost Signer"
participant RNG
participant Storage
participant Network
Client->>FrostSigner: sign_request(id, msg)
FrostSigner->>RNG: validate_rng()
alt RNG healthy
RNG-->>FrostSigner: ok
FrostSigner->>Storage: check_policy_and_load(session_id)
Storage-->>FrostSigner: policy, shares
FrostSigner->>FrostSigner: validate inputs (KEEP_ASSERT)
FrostSigner->>FrostSigner: generate commitment / signature
FrostSigner->>Storage: optionally checkpoint/update shares
FrostSigner->>Network: emit signature share / response
Network-->>Client: signature/share response
else RNG unhealthy
RNG-->>FrostSigner: fail
FrostSigner-->>Client: error (ERR_ASSERT / ERR_*)
end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Add Bitcoin PSBT parsing and Taproot sighash extraction #42 — Changes PSBT handling in protocol.{h,c}; directly related to switching rpc_request_t.psbt to a fixed-size array.
Implement FROST signer with NVS storage and two-round signing protocol #3 — Introduces/uses JSON-RPC internal error mapping (-32603); closely related to error_to_jsonrpc_code changes.
Refactor frost_signer to separate crypto, storage, and protocol concerns #108 — Adds assertion/error handling and KEEP_ASSERT patterns that overlap with the new ERR_ASSERT and assertion macros.

Poem

🐰 I twitched my nose and checked the stack,

No leaking mallocs on my track,
KEEP_ASSERT keeps burrows tight,
Logs now glow in cosy light,
Signature hops, secure at night.

🚥 Pre-merge checks | ✅ 2 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check	❓ Inconclusive	The title 'Harden codebase' is vague and generic, using a non-descriptive term that does not convey meaningful information about the specific changes.	Use a more specific title that highlights the main change, such as 'Add assertion macros and refactor hot-path memory allocation' or 'Consolidate logging and add precondition checks'.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Merge Conflict Detection	✅ Passed	✅ No merge conflicts detected when merging into `main`

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch hardening

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

main/nostr_frost.h (1)

77-86: ⚠️ Potential issue | 🟠 Major

Stack overflow risk: frost_sign_request_t is now ~4.2 KB.

Embedding a 4096-byte array directly in the struct means every local frost_sign_request_t variable consumes ~4.2 KB of stack. In frost_coordinator.c's websocket_event_handler, this struct is stack-allocated at line 92 alongside a char msg[4096] buffer (line 74), putting ~8.3 KB on the stack within the same scope. ESP32's default event loop task stack is only ~3.5 KB, making overflow certain.

Consider:

Making frost_sign_request_t static in the coordinator (single active request), or

Keeping the pointer-based payload design with bounded allocation.
main/error_codes.c (1)
359-372: ⚠️ Potential issue | 🟠 Major

error_to_jsonrpc_code default case leaks internal error codes to clients.

The default branch returns the raw internal error code (e.g., 0x0C01 for ERR_ASSERT). JSON-RPC clients should not see internal error taxonomy. Consider mapping all non-protocol errors to -32603 (internal error).
Proposed fix
     default:
-        return code;
+        return -32603; /* JSON-RPC internal error */
     }

🤖 Fix all issues with AI agents

In `@main/frost_coordinator.c`:
- Around line 72-146: The handler currently places a large stack buffer msg
(MAX_WS_MESSAGE_SIZE) and big structs like frost_sign_request_t on the stack
inside the WebSocket callback which can overflow ESP32 task stacks; fix by
removing large stack allocations: copy the incoming payload into heap memory
(malloc length+1) instead of char msg[MAX_WS_MESSAGE_SIZE], or enqueue a
heap-allocated copy of data->data_ptr/length to a worker task so parsing (calls
to cJSON_Parse, frost_parse_sign_request / frost_parse_sign_response /
frost_parse_dkg_round1_event / frost_parse_dkg_round2_event /
frost_parse_nip46_event) runs on a task with a larger stack; alternatively use a
single static buffer protected by g_ctx mutex for msg but still allocate large
result structs (e.g., frost_sign_request_t, frost_sign_response_t,
frost_dkg_round1_t, frost_dkg_round2_t, nip46_request_t) on the heap
(malloc/free) or have parse functions return heap objects; ensure all heap
allocations are freed after callbacks and adjust code paths that currently
expect stack-allocated structs
(frost_sign_request_free/frost_nip46_request_free) accordingly.

In `@main/frost_crypto_ops.c`:
- Around line 87-91: Remove the redundant participant-count check that
duplicates the earlier guard: delete the second "if (group->participant_count >
MAX_GROUP_PARTICIPANTS) { secp256k1_frost_vss_commitments_destroy(vss); return
-4; }" so execution can proceed to declare secp256k1_frost_keygen_secret_share
shares[MAX_GROUP_PARTICIPANTS];; keep the initial check that returns -2 intact
and leave the VSS cleanup only where it is needed (refer to
group->participant_count and secp256k1_frost_vss_commitments_destroy/vss in this
function to locate the duplicated guard).

In `@main/nostr_frost_sign.c`:
- Around line 184-188: The local stack allocation payload_hex (size
MAX_SIGN_PAYLOAD_SIZE * 2 + 1) can be ~8KB and may overflow ESP32 task stacks;
replace the large stack buffer with a heap allocation or a shared static buffer
protected by a mutex. Specifically, in the block that checks
request->payload_len and calls bytes_to_hex and cJSON_AddStringToObject,
allocate payload_hex via malloc(sizeof(...) ) before calling bytes_to_hex and
free it after cJSON_AddStringToObject (or change payload_hex to a static buffer
and guard usage with a mutex if reentrancy is required), ensuring you handle
allocation failure and keep the same behavior for request->payload_len bounds.

In `@main/storage_checkpoint.c`:
- Around line 186-187: Remove the redundant bounds check: the second check
comparing header.data_len to CHECKPOINT_MAX_DATA_SIZE is duplicative of the
earlier check against STORAGE_CHECKPOINT_MAX_SIZE - sizeof(header); keep only
one consistent validation (preferably the existing STORAGE_CHECKPOINT_MAX_SIZE -
sizeof(header) check) and delete the duplicate `if (header.data_len >
CHECKPOINT_MAX_DATA_SIZE) return STORAGE_ERR_INVALID_DATA;` so validation uses a
single canonical limit (header.data_len, STORAGE_CHECKPOINT_MAX_SIZE,
sizeof(header)).
- Around line 120-121: The second bounds check using CHECKPOINT_MAX_DATA_SIZE is
redundant because you already validate len against STORAGE_CHECKPOINT_MAX_SIZE -
sizeof(checkpoint_header_t) earlier; remove the unreachable check (the if (len >
CHECKPOINT_MAX_DATA_SIZE) return STORAGE_ERR_INVALID_DATA;) from the function to
avoid duplicate validation and keep the single canonical check (len >
STORAGE_CHECKPOINT_MAX_SIZE - sizeof(checkpoint_header_t)). Ensure you leave the
original check that references STORAGE_CHECKPOINT_MAX_SIZE and
checkpoint_header_t and keep the same error return STORAGE_ERR_INVALID_DATA
where that first check exists.
- Around line 37-38: The static file-scoped buffer checkpoint_crypto_buf (size
CHECKPOINT_MAX_DATA_SIZE) is shared by storage_checkpoint_save and
storage_checkpoint_load and must be protected for concurrent FreeRTOS tasks; add
a mutex (e.g., a static SemaphoreHandle_t checkpoint_mutex initialized once) and
take the mutex at the start of storage_checkpoint_save and
storage_checkpoint_load and give it before returning to serialize access, or
alternatively document a clear single-threaded invariant if you choose not to
add synchronization; ensure the mutex is created before first use and handle
failures to create/take it gracefully.

In `@main/storage.c`:
- Around line 501-505: storage_load_share currently only uses KEEP_ASSERT for
group and share_hex so when assertions are disabled a NULL group or share_hex
can crash (e.g., inside storage_pad_group_name). Add explicit NULL guards at the
top of storage_load_share: call storage_validate_group_name(group) (or check
group != NULL) and check share_hex != NULL, and return the same error code used
by storage_save_share on invalid args so the behavior matches that function;
ensure you reference storage_pad_group_name and storage_validate_group_name when
adding these checks.
- Around line 565-567: storage_delete_share lacks a real NULL check for the
group parameter (KEEP_ASSERT may be a no-op), which can lead to a crash in
storage_pad_group_name; add an explicit check at the start of
storage_delete_share to handle group == NULL gracefully (e.g., log an error and
return a failure code such as -1 or appropriate errno), ensuring you do not call
storage_pad_group_name when group is NULL and reference the function name
storage_delete_share and storage_pad_group_name so the change is made in the
correct spot.

🧹 Nitpick comments (4)

main/nostr_frost_sign.c (1)
93-103: Residual +1 from null-terminated string era is unnecessary for binary payloads.

The hex_len / 2 + 1 size check (line 93) and out_len argument (line 100) appear to be a leftover from when payload was a null-terminated allocated string. Since payload is now a raw uint8_t buffer and hex_to_bytes writes exactly hex_len / 2 bytes, the + 1 only adds confusion. It doesn't cause a buffer overflow (the guard still keeps it within MAX_SIGN_PAYLOAD_SIZE), but it artificially limits the usable payload size by one byte.
Suggested simplification
-                    if (hex_len / 2 + 1 > MAX_SIGN_PAYLOAD_SIZE) {
+                    if (hex_len / 2 > MAX_SIGN_PAYLOAD_SIZE) {
                         cJSON_Delete(inner);
                         free(decrypted);
                         cJSON_Delete(root);
                         return -7;
                     }
-                    int decoded = hex_to_bytes(payload_hex->valuestring, request->payload,
-                                               hex_len / 2 + 1);
+                    int decoded = hex_to_bytes(payload_hex->valuestring, request->payload,
+                                               MAX_SIGN_PAYLOAD_SIZE);
main/storage_crypto.c (1)
244-252: plaintext_len > 0 assertion has no fallback guard.

The KEEP_ASSERT(plaintext != NULL) at line 244 is backed by the !plaintext check at line 250 — good defense-in-depth. However, KEEP_ASSERT(plaintext_len > 0) at line 248 has no corresponding fallback check. If KEEP_ASSERT is compiled out or becomes a soft warning in some build configurations, a zero-length encrypt call would pass through to mbedtls_gcm_crypt_and_tag unchecked.

The same applies to KEEP_ASSERT(ciphertext_len > 0) at line 280 in storage_crypto_decrypt.
Add fallback check consistent with other guards
     KEEP_ASSERT(plaintext_len > 0);

-    if (!key_initialized || !plaintext || !nonce || !ciphertext || !tag) {
+    if (!key_initialized || !plaintext || !nonce || !ciphertext || !tag || plaintext_len == 0) {
         return -1;
     }
And similarly for decrypt:
     KEEP_ASSERT(ciphertext_len > 0);

-    if (!key_initialized || !ciphertext || !nonce || !tag || !plaintext) {
+    if (!key_initialized || !ciphertext || !nonce || !tag || !plaintext || ciphertext_len == 0) {
         return -1;
     }
main/frost_crypto_ops.c (1)

70-79: KEEP_ASSERT checks duplicate the existing NULL guards below.

Lines 70-73 assert group, round1, secret_shares_out, share_count are non-NULL, then lines 78-79 re-check the exact same pointers. The existing guards are now unreachable (KEEP_ASSERT already returned ERR_ASSERT). This pattern repeats in frost_dkg_finalize (lines 156-166) and frost_sign_partial (lines 237-244).

Not a bug — just dead code after the assertions. Consider removing the redundant if blocks to avoid confusion about which error code is actually returned.

main/storage_checkpoint.c (1)

308-318: KEEP_ASSERT guards followed by now-redundant NULL checks.

Same pattern as noted in frost_crypto_ops.c — the KEEP_ASSERT at lines 309–311 / 378–380 already returns ERR_ASSERT before the manual if (!session_id || !data) checks can execute. The manual checks are dead code.

Also applies to: 377-389

main/frost_coordinator.c

main/frost_crypto_ops.c

main/nostr_frost_sign.c

main/storage_checkpoint.c

main/storage.c

- Heap-allocate large buffers in frost_coordinator WebSocket handler - Validate subscription IDs to prevent JSON injection - Replace portMAX_DELAY with finite 10s timeout on WS sends - Check esp_websocket_client_send_text return values - Revert frost_sign_request_t.payload to heap pointer (was 4KB on stack) - Heap-allocate payload_hex in frost_create_sign_request - Add mutex protection for checkpoint_crypto_buf shared buffer - Map all non-protocol errors to -32603 in JSON-RPC responses - Wrap error context output in #ifndef NDEBUG - Clear sensitive fields (share, dkg_data) in protocol_free_request - Secure-zero share_hex and result buffers in DKG round2 - Remove dead code and redundant NULL checks after KEEP_ASSERT - Reduce session resume timeout from 10x to 3x

coderabbitai

Actionable comments posted: 5

🤖 Fix all issues with AI agents

In `@main/frost_coordinator.c`:
- Around line 406-410: The call to esp_websocket_client_send_text in the
unsubscribe path ignores its return value; update the block that iterates
g_ctx.relays (relay_connection_t *relay / COORDINATOR_STATE_CONNECTED /
relay->ws_handle) to capture the return from
esp_websocket_client_send_text(close_msg, strlen(close_msg),
pdMS_TO_TICKS(WS_SEND_TIMEOUT_MS)), check for non-success, and handle failures
(e.g., log via your logging facility, optionally mark relay->state as
disconnected or schedule a retry/cleanup) so send errors are not silently
ignored.
- Around line 87-95: The code around the websocket message handling (check of
data->op_code, allocation of msg, memcpy and call to cJSON_Parse) has
clang-format violations; run clang-format over this block or reformat so spacing
and indentation follow project style (align conditions, space around operators,
and wrap function calls consistently), ensuring the if statement, malloc/memcpy
lines, and the cJSON_Parse/free(msg) lines are formatted correctly; verify the
changes touch the block that uses data->op_code, msg, memcpy, cJSON_Parse, and
free to resolve the CI formatting errors.
- Around line 27-39: The function is_safe_subscription_id has clang-format style
violations; run clang-format (or apply the project's formatting rules) on that
function to restore consistent indentation and spacing, specifically reflow the
declaration and the for-loop/if blocks inside is_safe_subscription_id so they
match the repository's style; locate the function by name
is_safe_subscription_id and update its formatting (no logic changes) so CI's
clang-format check passes.
- Around line 384-386: The subscribe/unsubscribe path calls
esp_websocket_client_send_text without checking its return value; replicate the
publish_event behavior by capturing the return (e.g., int res =
esp_websocket_client_send_text(...)) and if res < 0 log a warning via
ESP_LOGW(TAG, ...) including relay->url and the error code so send failures are
observable; also fix the clang-format violation by ensuring the if-statement and
its body follow project formatting (brace/indentation) in the subscribe block
that references relay->state, relay->ws_handle and WS_SEND_TIMEOUT_MS.

In `@main/storage_checkpoint.c`:
- Around line 44-55: The clang-format failures stem from single-line if bodies
in checkpoint_lock and checkpoint_unlock not matching project style; update the
implementations to use consistent brace formatting and spacing: ensure the
static variable checkpoint_mutex remains, and change checkpoint_lock and
checkpoint_unlock to use explicit braced blocks for the if checks (referencing
functions checkpoint_lock and checkpoint_unlock and variable checkpoint_mutex)
so each conditional body is on its own line with braces and proper indentation
to satisfy clang-format.

🧹 Nitpick comments (3)

main/frost_coordinator.c (1)

96-164: Consider protecting g_ctx fields accessed in the WS handler with the existing mutex.

The handler reads g_ctx.callbacks, g_ctx.current_group, and g_ctx.privkey without holding g_ctx.mutex, while other tasks can modify these via frost_coordinator_set_callbacks, frost_coordinator_set_group, or frost_coordinator_deinit. Since this is a hardening PR and the mutex already exists, it would be a natural addition to guard against data races here.

main/frost_signer.c (1)

240-275: Message length validation reuses SESSION_ID_* constants — consider named constants for clarity.

Lines 254 and 259 use SESSION_ID_HEX_LEN and SESSION_ID_LEN to validate and parse the message. While numerically correct (messages are 32-byte hashes), using session-ID-specific constants for the message makes the code harder to reason about for future readers.

main/storage_checkpoint.c (1)

46-51: checkpoint_lock silently becomes a no-op if called before checkpoint_init.

When checkpoint_mutex is NULL (before checkpoint_init creates it), checkpoint_lock() silently returns without acquiring any lock. While the current call flow always calls checkpoint_init() first (via storage_checkpoint_save/load), a future caller bypassing init would get no locking silently.

Consider at minimum an assert or log, though the current code flow does guard this correctly.

main/frost_coordinator.c

main/storage_checkpoint.c

- Revert checkpoint_crypto_buf from 24KB static to heap-allocated (fixes dram0_0_seg overflow by 4280 bytes) - Keep mutex protection for thread safety - Fix clang-format violations in coordinator, sign, checkpoint - Add missing direct includes for clang-tidy misc-include-cleaner

- Log warnings on failed subscribe/unsubscribe sends - Add direct includes to frost_signer_core.c and frost_signer_storage.c for clang-tidy misc-include-cleaner

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)

main/storage_checkpoint.c (3)
136-151: ⚠️ Potential issue | 🟠 Major

TOCTOU: existing-checkpoint check races with the lock acquisition.

The read of existing at line 137 and the CHECKPOINT_MAGIC check at line 140 happen before checkpoint_lock() at line 151. Two concurrent callers could both see "no existing checkpoint" and both proceed to write, corrupting data. Move the lock acquisition before the existence check (or move the existence check inside the locked region).
Proposed fix
     if (checkpoint_init() != 0)
         return STORAGE_ERR_NOT_INIT;
 
+    checkpoint_lock();
+
     checkpoint_header_t existing;
     esp_err_t err = esp_partition_read(checkpoint_partition, 0, &existing, sizeof(existing));
-    if (err != ESP_OK)
+    if (err != ESP_OK) {
+        checkpoint_unlock();
         return STORAGE_ERR_IO;
+    }
     if (existing.magic == CHECKPOINT_MAGIC) {
+        checkpoint_unlock();
         return STORAGE_ERR_CHECKPOINT_EXISTS;
     }
 
     checkpoint_header_t header;
     memset(&header, 0, sizeof(header));
     header.magic = CHECKPOINT_MAGIC;
     pad_session_id(header.session_id, session_id);
     header.counter = checkpoint_get_counter();
     header.data_len = (uint16_t)len;
 
-    checkpoint_lock();
-
     uint8_t *encrypted = malloc(len);
259-294: ⚠️ Potential issue | 🟠 Major

storage_checkpoint_clear lacks mutex protection.

This function reads, then erases the checkpoint partition without holding checkpoint_lock. It can race with concurrent storage_checkpoint_save or storage_checkpoint_load calls, leading to corrupted reads/writes. Wrap the critical section with checkpoint_lock()/checkpoint_unlock() similar to the save and load paths.

556-561: ⚠️ Potential issue | 🟡 Minor

Mutex is not destroyed in storage_checkpoint_cleanup, leaking the semaphore handle.

If checkpoint_init / storage_checkpoint_cleanup can be called multiple times (e.g., during testing or re-initialization), the checkpoint_mutex semaphore is never freed, leaking a FreeRTOS kernel object each cycle.
Proposed fix
 void storage_checkpoint_cleanup(void) {
+#ifdef ESP_PLATFORM
+    if (checkpoint_mutex) {
+        vSemaphoreDelete(checkpoint_mutex);
+        checkpoint_mutex = NULL;
+    }
+#endif
     checkpoint_initialized = false;
     checkpoint_partition = NULL;
     checkpoint_counter = 0;
     checkpoint_counter_loaded = false;
 }

🤖 Fix all issues with AI agents

In `@main/frost_coordinator.c`:
- Around line 88-167: The code stack-allocates large event structs
(frost_sign_response_t, frost_dkg_round1_t, frost_dkg_round2_t, nip46_request_t)
inside the WEBSOCKET_EVENT_DATA handling, which can blow the stack (especially
frost_dkg_round1_t); change those local stack allocations to heap allocations
instead: replace locals like "frost_dkg_round1_t r1;" with pointers (e.g.
"frost_dkg_round1_t *r1 = malloc(sizeof(*r1));"), check malloc success, pass the
pointer to frost_parse_dkg_round1_event and the callback, free the pointer after
use, and do the same pattern for frost_sign_response_t, frost_dkg_round2_t and
nip46_request_t (also ensure frost_nip46_request_free is still called where
appropriate and avoid double-free). Locate changes around the event_str handling
and the calls to frost_parse_sign_response, frost_parse_dkg_round1_event,
frost_parse_dkg_round2_event, and frost_parse_nip46_event in
frost_coordinator.c.

🧹 Nitpick comments (1)

main/storage_checkpoint.c (1)

360-363: KEEP_ASSERT will crash on invalid input instead of returning an error.

storage_checkpoint_save (line 126) and storage_checkpoint_clear (line 260) return STORAGE_ERR_INVALID_DATA for null/invalid arguments, but storage_save_session_checkpoint uses KEEP_ASSERT which will abort. On an embedded device, crashing due to a null pointer from a caller is a harsh failure mode. Consider using the same graceful error-return pattern for consistency and resilience, reserving KEEP_ASSERT for true invariants that indicate internal logic errors.

main/frost_coordinator.c

- Heap-allocate frost_sign_response_t, frost_dkg_round1_t (~1.1KB), frost_dkg_round2_t, nip46_request_t in websocket handler - Fix TOCTOU: move checkpoint_lock() before existence check in save - Add mutex protection to storage_checkpoint_clear - Destroy mutex in storage_checkpoint_cleanup - Add direct includes for clang-tidy misc-include-cleaner

kwsantiago self-assigned this Feb 12, 2026

coderabbitai bot reviewed Feb 12, 2026

View reviewed changes

main/frost_coordinator.c Show resolved Hide resolved

main/frost_coordinator.c Show resolved Hide resolved

main/frost_coordinator.c Outdated Show resolved Hide resolved

main/frost_coordinator.c Show resolved Hide resolved

main/storage_checkpoint.c Show resolved Hide resolved

kwsantiago added 2 commits February 12, 2026 15:50

Check WS send return in subscribe/unsubscribe, fix clang-tidy includes

a5fd784

- Log warnings on failed subscribe/unsubscribe sends - Add direct includes to frost_signer_core.c and frost_signer_storage.c for clang-tidy misc-include-cleaner

coderabbitai bot reviewed Feb 12, 2026

View reviewed changes

main/frost_coordinator.c Show resolved Hide resolved

kwsantiago added 2 commits February 12, 2026 15:59

Fix clang-tidy CI: remove ESP_PLATFORM=0 that triggers esp_log.h include

b7223ac

kwsantiago merged commit a73b2bb into main Feb 12, 2026
9 checks passed

kwsantiago deleted the hardening branch February 12, 2026 21:10

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Comments

Harden codebase#137

Harden codebase#137
kwsantiago merged 6 commits intomainfrom
hardening

kwsantiago commented Feb 12, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Feb 12, 2026 •

edited

Loading

Rate limit exceeded

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Comments

Conversation

kwsantiago commented Feb 12, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Feb 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

kwsantiago commented Feb 12, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Feb 12, 2026 •

edited

Loading