Skip to content

Descriptor proposal authorization, per-phase timeouts, and protocol hardening#277

Merged
kwsantiago merged 11 commits intomainfrom
descriptor-and-per-phase
Feb 26, 2026
Merged

Descriptor proposal authorization, per-phase timeouts, and protocol hardening#277
kwsantiago merged 11 commits intomainfrom
descriptor-and-per-phase

Conversation

@wksantiago
Copy link
Contributor

@wksantiago wksantiago commented Feb 25, 2026

Summary

  • Add descriptor proposer authorization via allow-list
  • Add per-phase timeouts (contribution, finalize, ack) with session-level cap
  • Verify session_id binding on receiver side
  • Add xpub prefix and min-length validation to DescriptorPropose, DescriptorContribute, and policy tier key slots
  • Validate participant share_index in policy tiers
  • Reap completed/failed sessions after grace period
  • Deduplicate is_expired/expired_phase logic

Test plan

  • All 178 keep-frost-net tests pass
  • All 50 keep-bitcoin tests pass
  • Full workspace build succeeds

Summary by CodeRabbit

  • Bug Fixes

    • Sessions now expire by phase (contribution, finalize, ack, reap) with clear reasons.
    • Enforced proposer authorization to block unauthorized descriptor submissions.
    • Tightened xpub and fingerprint validation (prefixes and exact lengths).
  • Improvements

    • Sessions record per-phase timeouts and emit failure events with phase reasons during cleanup.
    • Finalization and ACK flows hardened; contributor/tampering checks improved.
    • Limits added to announcement tracking to prevent unbounded growth.

@coderabbitai
Copy link

coderabbitai bot commented Feb 25, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Implements phase-based session expiry with per-phase timeouts and reasons, tightens xpub/fingerprint validation and proposer authorization, records phase timestamps, exposes expired_phase() on sessions, changes cleanup_expired() to return expired session IDs with phase reasons, and emits DescriptorFailed events for expired descriptor sessions.

Changes

Cohort / File(s) Summary
Session Expiry / Lifecycle
keep-frost-net/src/descriptor_session.rs
Adds per-phase timeout constants & fields, REAP_GRACE_SECS, timestamps (contributions_complete_at, finalized_at), expired_phase() API, updates is_expired() to use phases, sets timestamps on transitions, and changes cleanup_expired() to return Vec<([u8;32], String)>.
Descriptor Handling (proposal → finalize → ack)
keep-frost-net/src/node/descriptor.rs
Adds proposer authorization checks, validates created_at/derived session_id, tightens xpub/fingerprint validation and length checks, consolidates descriptor length validation, extracts build_key_proof helper, hardens finalize/ACK/NACK flows and error/cleanup behavior.
Node State & Cleanup Integration
keep-frost-net/src/node/mod.rs
Adds descriptor_proposers: RwLock<HashSet<u16>> with accessors and initialization; periodic cleanup consumes cleanup_expired() output and emits DescriptorFailed(session_id, reason) events.
Protocol Constants & Validation
keep-frost-net/src/protocol.rs
Adds DESCRIPTOR_CONTRIBUTION_TIMEOUT_SECS, DESCRIPTOR_FINALIZE_TIMEOUT_SECS, DESCRIPTOR_ACK_PHASE_TIMEOUT_SECS; strengthens xpub prefix/length and fingerprint validation; DescriptorProposePayload::new now accepts created_at.
Tests & Callers
keep-frost-net/src/...tests*, keep-frost-net/src/descriptor_session.rs usages
Adds/adjusts tests for per-phase timeouts, cleanup returning phase reasons, xpub/fingerprint validation; callers updated to capture/ignore cleanup_expired() return as appropriate.

Sequence Diagram(s)

sequenceDiagram
    participant Node as KfpNode
    participant Manager as DescriptorSessionManager
    participant Session as DescriptorSession
    participant Events as EventChannel

    Node->>Manager: cleanup_expired()
    activate Manager
    Manager->>Session: expired_phase() / is_expired()
    activate Session
    Session-->>Manager: (session_id, "phase reason")
    deactivate Session
    Manager-->>Node: Vec<(session_id, phase_reason)>
    deactivate Manager

    alt expired sessions exist
        Node->>Events: emit DescriptorFailed(session_id, reason)
    else none
        Node->>Node: continue normal processing
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

Possibly related PRs

Poem

🐇 I hopped through timestamps, phases in tow,
Contributions sealed, then finalize’s glow,
Xpubs trimmed neat, proposers set the gate,
When sessions expire, the rabbit notes the date,
A tiny cheer for tidy state and fate.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 26.32% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main changes: descriptor proposal authorization, per-phase timeouts, and protocol hardening are the primary objectives evident throughout the changeset.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch descriptor-and-per-phase

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
keep-frost-net/src/protocol.rs (1)

372-388: ⚠️ Potential issue | 🟡 Minor

Missing xpub prefix validation for contributions inside DescriptorFinalize.

The finalize payload validates min/max xpub length and fingerprint length for each contribution, but unlike DescriptorContribute and DescriptorPropose, it does not check VALID_XPUB_PREFIXES on contrib.account_xpub. A malicious initiator could forward contributions with an invalid prefix in the finalize message and pass protocol-level validation.

Proposed fix
                     if contrib.account_xpub.len() > MAX_XPUB_LENGTH {
                         return Err("Contribution xpub exceeds maximum length");
                     }
+                    if !VALID_XPUB_PREFIXES
+                        .iter()
+                        .any(|pfx| contrib.account_xpub.starts_with(pfx))
+                    {
+                        return Err("Contribution xpub has invalid prefix");
+                    }
                     if contrib.fingerprint.len() > MAX_FINGERPRINT_LENGTH {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@keep-frost-net/src/protocol.rs` around lines 372 - 388, The
DescriptorFinalize validation loop currently checks lengths but misses verifying
that each contribution's account_xpub has an allowed prefix; update the loop in
protocol.rs (the for (&idx, contrib) in &p.contributions block used when
validating DescriptorFinalize) to assert that contrib.account_xpub starts with
one of the entries in VALID_XPUB_PREFIXES and return an error (consistent style
with other checks) if it does not; use the existing VALID_XPUB_PREFIXES constant
and string starts_with checks to implement this prefix validation and keep error
messages consistent with the other contribution checks.
🧹 Nitpick comments (2)
keep-frost-net/src/descriptor_session.rs (1)

89-95: Per-phase timeouts are hardcoded from constants, not derived from the session-level timeout parameter.

If DescriptorSessionManager::with_timeout(Duration::from_secs(30)) is used, the session-level cap is 30s, but contribution_timeout is still 300s. This is safe because expired_phase() checks the session-level cap first (Line 293), so the session will expire at 30s regardless. However, it means per-phase granularity is effectively lost for custom timeouts.

Consider scaling per-phase timeouts proportionally or accepting them as constructor parameters if configurability becomes needed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@keep-frost-net/src/descriptor_session.rs` around lines 89 - 95, The per-phase
timeouts (fields contribution_timeout, finalize_timeout, ack_phase_timeout) are
fixed to constants and don't honor the session-level Duration passed to
DescriptorSessionManager::with_timeout, which loses per-phase granularity;
update the constructor (or with_timeout) to derive these fields from the
provided timeout by scaling them proportionally (e.g., multiply timeout by
constants' ratios) or accept per-phase Duration parameters so
contribution_timeout/finalize_timeout/ack_phase_timeout are computed from the
session timeout, and ensure expired_phase still uses the new derived values.
keep-frost-net/src/protocol.rs (1)

29-31: New per-phase timeout constants look good.

The values (300s contribution, 120s finalize, 120s ack) are sensible relative to the existing 600s session-level cap. One naming nit: DESCRIPTOR_ACK_PHASE_TIMEOUT_SECS (Line 31) vs the pre-existing DESCRIPTOR_ACK_TIMEOUT_SECS (Line 28) may confuse callers — consider a doc comment clarifying that the former governs the entire ack collection phase while the latter is the per-peer timeout.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@keep-frost-net/src/protocol.rs` around lines 29 - 31, Add clear doc comments
for the two ACK-related constants to remove ambiguity: update the declarations
of DESCRIPTOR_ACK_PHASE_TIMEOUT_SECS and DESCRIPTOR_ACK_TIMEOUT_SECS to include
a short comment stating that DESCRIPTOR_ACK_PHASE_TIMEOUT_SECS is the timeout
for the entire ack-collection phase across peers, while
DESCRIPTOR_ACK_TIMEOUT_SECS is the per-peer ack response timeout; keep the
current numeric values but make the semantic distinction explicit so callers
understand which timeout to use.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@keep-frost-net/src/protocol.rs`:
- Around line 372-388: The DescriptorFinalize validation loop currently checks
lengths but misses verifying that each contribution's account_xpub has an
allowed prefix; update the loop in protocol.rs (the for (&idx, contrib) in
&p.contributions block used when validating DescriptorFinalize) to assert that
contrib.account_xpub starts with one of the entries in VALID_XPUB_PREFIXES and
return an error (consistent style with other checks) if it does not; use the
existing VALID_XPUB_PREFIXES constant and string starts_with checks to implement
this prefix validation and keep error messages consistent with the other
contribution checks.

---

Nitpick comments:
In `@keep-frost-net/src/descriptor_session.rs`:
- Around line 89-95: The per-phase timeouts (fields contribution_timeout,
finalize_timeout, ack_phase_timeout) are fixed to constants and don't honor the
session-level Duration passed to DescriptorSessionManager::with_timeout, which
loses per-phase granularity; update the constructor (or with_timeout) to derive
these fields from the provided timeout by scaling them proportionally (e.g.,
multiply timeout by constants' ratios) or accept per-phase Duration parameters
so contribution_timeout/finalize_timeout/ack_phase_timeout are computed from the
session timeout, and ensure expired_phase still uses the new derived values.

In `@keep-frost-net/src/protocol.rs`:
- Around line 29-31: Add clear doc comments for the two ACK-related constants to
remove ambiguity: update the declarations of DESCRIPTOR_ACK_PHASE_TIMEOUT_SECS
and DESCRIPTOR_ACK_TIMEOUT_SECS to include a short comment stating that
DESCRIPTOR_ACK_PHASE_TIMEOUT_SECS is the timeout for the entire ack-collection
phase across peers, while DESCRIPTOR_ACK_TIMEOUT_SECS is the per-peer ack
response timeout; keep the current numeric values but make the semantic
distinction explicit so callers understand which timeout to use.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ea12fca and ccade42.

📒 Files selected for processing (4)
  • keep-frost-net/src/descriptor_session.rs
  • keep-frost-net/src/node/descriptor.rs
  • keep-frost-net/src/node/mod.rs
  • keep-frost-net/src/protocol.rs

@wksantiago
Copy link
Contributor Author

@coderabbitai review

@coderabbitai
Copy link

coderabbitai bot commented Feb 25, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@wksantiago
Copy link
Contributor Author

@coderabbitai review

@wksantiago wksantiago self-assigned this Feb 26, 2026
@coderabbitai
Copy link

coderabbitai bot commented Feb 26, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

- Replace expect() panic in expired_phase() with safe fallback
- Guard fail() to prevent state downgrade from Complete
- Guard set_initiator against overwrites
- Validate network early in request_descriptor
- Strict fingerprint validation (8 hex chars)
- Align xpub prefix checking with VALID_XPUB_PREFIXES
- Send NACK on key proof failure instead of silent error
- Transition receiver session state after finalize
- Skip events on duplicate proposal
- Cap seen_xpub_announces set at 10k entries
- Guard NACK handler against completed sessions
- Safe timestamp cast with .max(0)
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@keep-frost-net/src/node/descriptor.rs`:
- Around line 641-652: The code currently calls session.set_finalized(...) and
then emits KfpNodeEvent::DescriptorComplete even when set_finalized failed;
change the flow in the block that obtains self.descriptor_sessions.write() and
looks up sessions.get_session_mut(&payload.session_id) so that you only emit the
KfpNodeEvent::DescriptorComplete after session.set_finalized(...) returns Ok —
on Err(e) log the error (as debug/error) and do not emit the DescriptorComplete
event (return early or skip emission). Apply the same change to the similar
block around session.set_finalized at the second location (lines ~654-659) so
both receiver and other finalization paths only emit events on successful
set_finalized().
- Around line 889-896: The current logic checks the 10_000 limit on
self.seen_xpub_announces before detecting duplicates, causing already-seen
dedup_key to error when the set is full; change the order to check membership
first (use seen.contains(&dedup_key) or attempt insert and handle false as
duplicate) and return Ok(()) for duplicates, and only enforce the capacity error
(FrostNetError::Session "Too many xpub announcements tracked") when the key is
not already present; update the block around seen_xpub_announces, dedup_key, and
insert to reflect this reordered logic and ensure no permanent insert occurs if
you decide to insert then reject due to capacity.
- Around line 45-47: The session_id is derived from created_at but
DescriptorProposePayload::new() generates its own timestamp, causing mismatches;
modify the DescriptorProposePayload construction to accept the precomputed
created_at and use that timestamp instead of calling Utc::now() internally so
derive_descriptor_session_id(&self.group_pubkey, &policy, created_at) and
DescriptorProposePayload::new(...) share the exact same created_at value; update
the DescriptorProposePayload::new signature and all call sites (the place
creating the payload near the derive_descriptor_session_id call) to pass
created_at through.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 287c4c1 and 25ee8ec.

📒 Files selected for processing (2)
  • keep-frost-net/src/descriptor_session.rs
  • keep-frost-net/src/node/descriptor.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • keep-frost-net/src/descriptor_session.rs

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
keep-frost-net/src/protocol.rs (1)

306-325: KeySlot validation is comprehensive.

Good checks for Participant share_index and External xpub format. One minor observation: the External fingerprint only validates max length, while initiator_fingerprint (line 280) and DescriptorContribute.fingerprint (line 346) also check for non-empty. If External keys always require a fingerprint, consider adding an empty check for consistency.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@keep-frost-net/src/protocol.rs` around lines 306 - 325, The External arm of
KeySlot validation currently only enforces a maximum fingerprint length; add a
non-empty check so External fingerprints must be present (non-empty) to match
checks like initiator_fingerprint and DescriptorContribute.fingerprint. In the
KeySlot::External branch, after validating xpub length/prefix (using
MIN_XPUB_LENGTH, MAX_XPUB_LENGTH, VALID_XPUB_PREFIXES) and before the
MAX_FINGERPRINT_LENGTH check, return an error if fingerprint.is_empty() (use the
same style/error messages as other checks) to ensure External fingerprint is
required and consistent with initiator_fingerprint and
DescriptorContribute.fingerprint validation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@keep-frost-net/src/protocol.rs`:
- Around line 306-325: The External arm of KeySlot validation currently only
enforces a maximum fingerprint length; add a non-empty check so External
fingerprints must be present (non-empty) to match checks like
initiator_fingerprint and DescriptorContribute.fingerprint. In the
KeySlot::External branch, after validating xpub length/prefix (using
MIN_XPUB_LENGTH, MAX_XPUB_LENGTH, VALID_XPUB_PREFIXES) and before the
MAX_FINGERPRINT_LENGTH check, return an error if fingerprint.is_empty() (use the
same style/error messages as other checks) to ensure External fingerprint is
required and consistent with initiator_fingerprint and
DescriptorContribute.fingerprint validation.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 25ee8ec and faa7227.

📒 Files selected for processing (2)
  • keep-frost-net/src/node/descriptor.rs
  • keep-frost-net/src/protocol.rs

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
keep-frost-net/src/node/descriptor.rs (1)

634-652: ⚠️ Potential issue | 🟠 Major

DescriptorComplete is emitted even when receiver finalization fails.

If set_finalized fails (or session is missing), this path still emits completion, which can desync node state vs events.

🔧 Proposed fix
-        {
-            let mut sessions = self.descriptor_sessions.write();
-            if let Some(session) = sessions.get_session_mut(&payload.session_id) {
-                if let Err(e) = session.set_finalized(FinalizedDescriptor {
-                    external: payload.external_descriptor.clone(),
-                    internal: payload.internal_descriptor.clone(),
-                    policy_hash: payload.policy_hash,
-                }) {
-                    debug!("Failed to finalize receiver session: {e}");
-                }
-            }
-        }
+        let finalized_ok = {
+            let mut sessions = self.descriptor_sessions.write();
+            if let Some(session) = sessions.get_session_mut(&payload.session_id) {
+                if let Err(e) = session.set_finalized(FinalizedDescriptor {
+                    external: payload.external_descriptor.clone(),
+                    internal: payload.internal_descriptor.clone(),
+                    policy_hash: payload.policy_hash,
+                }) {
+                    debug!("Failed to finalize receiver session: {e}");
+                    false
+                } else {
+                    true
+                }
+            } else {
+                debug!("Descriptor session missing during receiver finalization");
+                false
+            }
+        };
+
+        if !finalized_ok {
+            return Ok(());
+        }
 
         let _ = self.event_tx.send(KfpNodeEvent::DescriptorComplete {
             session_id: payload.session_id,
             external_descriptor: payload.external_descriptor,
             internal_descriptor: payload.internal_descriptor,
             network: session_network,
         });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@keep-frost-net/src/node/descriptor.rs` around lines 634 - 652, The code
currently sends KfpNodeEvent::DescriptorComplete regardless of whether
descriptor_sessions.get_session_mut(...).set_finalized(...) succeeded, which can
desync state; update the logic in the block around
descriptor_sessions/get_session_mut and set_finalized so that the
DescriptorComplete event is only sent when the session exists and set_finalized
returns Ok — i.e., check the result of get_session_mut and the Result from
session.set_finalized(FinalizedDescriptor { ... }) and move the
event_tx.send(KfpNodeEvent::DescriptorComplete { ... }) inside the success
branch (or return/early-exit on Err) so failures (or missing sessions) do not
emit the completion event; reference the identifiers descriptor_sessions,
get_session_mut, set_finalized, FinalizedDescriptor, event_tx and
KfpNodeEvent::DescriptorComplete.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@keep-frost-net/src/descriptor_session.rs`:
- Around line 301-326: The top-level check for self.created_at.elapsed() >
self.timeout triggers "session" for all states and prevents the "reap" branch
for DescriptorSessionState::Complete/Failed from ever running; fix by making the
generic session-timeout check only apply to non-terminal states (e.g., remove or
defer the top-level created_at timeout check) and instead handle the reap logic
first inside the DescriptorSessionState::Complete and
DescriptorSessionState::Failed(_) arm: compute if self.created_at.elapsed() >
self.timeout + Duration::from_secs(REAP_GRACE_SECS) and return Some("reap")
before any session-timeout decision (or move the session timeout check into the
Proposed/Finalized branches), ensuring references to self.created_at,
self.timeout, REAP_GRACE_SECS, DescriptorSessionState::Complete and Failed are
used to locate the changed logic.

---

Duplicate comments:
In `@keep-frost-net/src/node/descriptor.rs`:
- Around line 634-652: The code currently sends KfpNodeEvent::DescriptorComplete
regardless of whether
descriptor_sessions.get_session_mut(...).set_finalized(...) succeeded, which can
desync state; update the logic in the block around
descriptor_sessions/get_session_mut and set_finalized so that the
DescriptorComplete event is only sent when the session exists and set_finalized
returns Ok — i.e., check the result of get_session_mut and the Result from
session.set_finalized(FinalizedDescriptor { ... }) and move the
event_tx.send(KfpNodeEvent::DescriptorComplete { ... }) inside the success
branch (or return/early-exit on Err) so failures (or missing sessions) do not
emit the completion event; reference the identifiers descriptor_sessions,
get_session_mut, set_finalized, FinalizedDescriptor, event_tx and
KfpNodeEvent::DescriptorComplete.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between faa7227 and 4e9ca63.

📒 Files selected for processing (3)
  • keep-frost-net/src/descriptor_session.rs
  • keep-frost-net/src/node/descriptor.rs
  • keep-frost-net/src/protocol.rs

@kwsantiago kwsantiago merged commit a8a2ac8 into main Feb 26, 2026
9 checks passed
@kwsantiago kwsantiago deleted the descriptor-and-per-phase branch February 26, 2026 15:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add descriptor proposal authorization Add per-phase timeouts to descriptor coordination

2 participants