lnwallet/channeldb: restore fee updates after restart#10622
lnwallet/channeldb: restore fee updates after restart#10622ziggie1984 wants to merge 1 commit intolightningnetwork:masterfrom
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses a subtle but critical issue in channel state management where fee updates were not properly restored from disk after a node restart. This oversight could lead to commitment signature verification failures during channel reestablishment, resulting in unnecessary channel force closures. The changes ensure that all pending fee updates are correctly reloaded, maintaining channel consistency and preventing erroneous state discrepancies. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request addresses a critical bug where fee updates were not correctly restored after a restart, potentially leading to incorrect force closures. The refactoring in AdvanceCommitChainTail to prevent an early return is a clean and effective fix, ensuring that all pending local updates are persisted correctly. The addition of TestChanSyncUpdateFeeRestartAfterRevoke is excellent, as it thoroughly covers the specific failure scenario and confirms the fix. The changes are well-implemented and improve the overall robustness of the channel state management.
Note: Security Review is unavailable for this PR.
Fix a bug where UpdateFee entries persisted under remoteUnsignedLocalUpdatesKey were silently skipped during restoreStateLogs, leaving the update log empty after a restart. This caused a retransmitted commit_sig (covering the new fee rate) to fail verification after channel reestablishment, incorrectly triggering a force close. Also refactor AdvanceCommitChainTail to avoid an early return when unsignedAckedUpdates is nil, so that the full validation path always runs. Adds TestChanSyncUpdateFeeRestartAfterRevoke (lnwallet) and TestChannelLinkFailDuringRestart (htlcswitch) to reproduce the scenario at both the channel and link layers.
952f500 to
39b8699
Compare
🔴 PR Severity: CRITICAL
🔴 Critical (1 file)
🟢 Low (2 files — excluded from classification)
AnalysisThe only non-test file changed is The two other changed files are test files ( Bump check: 1 non-test file (threshold: >20) and 18 non-test lines changed (threshold: >500) — no severity bump applied. To override, add a |
Bortlesboat
left a comment
There was a problem hiding this comment.
ACK 39b8699
Good catch — I've seen this exact pattern cause spurious force closes on my node after fee spikes. The channeldb fix is clean. Left a couple of notes on the link test.
| time.Sleep(5 * time.Second) | ||
|
|
||
| case <-time.After(5 * time.Second): | ||
| t.Fatalf("did not receive channel_reestablish message") |
There was a problem hiding this comment.
The time.Sleep(5 * time.Second) here to wait for the force close processing feels brittle — if the test runner is slow this might not be enough, and on fast machines it just wastes time. Could this use a channel signal or poll on a condition instead? Something like piping through the OnChannelFailure callback into a linkErrors chan (like TestChannelLinkUpdateCommitFee does) and selecting on that with a timeout would be more deterministic.
| // TestChannelLinkFailDuringRestart tests that the link fails (force-closing | ||
| // the channel) if the commit dance is not completed after sharing the | ||
| // update_fee and one of the peers restarts before completion. | ||
| // |
There was a problem hiding this comment.
Nit: the test doc comment says "tests that the link fails (force-closing the channel)" but after the fix the assertion is that force close does NOT happen. Might want to update the comment to reflect the corrected behavior, something like "tests that the link does not force-close after a restart when a fee update was pending".
| lnwire.ShortChannelID, LinkFailureError) { | ||
| OnChannelFailure: func(_ lnwire.ChannelID, | ||
| _ lnwire.ShortChannelID, linkErr LinkFailureError) { | ||
|
|
There was a problem hiding this comment.
This changes shared test infrastructure — restartLink is used by ~15 other tests. Adding require.NotEqual(t, linkErr.FailureAction, LinkFailureForceClose) here means any future test that legitimately expects a force close through this path would silently break. Safer to keep the no-op default and override OnChannelFailure in TestChannelLinkFailDuringRestart specifically, like TestChannelLinkUpdateCommitFee does.
Summary
UpdateFeeentries persisted underremoteUnsignedLocalUpdatesKeywere silently skipped duringrestoreStateLogs, leaving the update log empty after a restartcommit_sig(covering the new fee rate) to fail verification after channel reestablishment, incorrectly triggering a force closeAdvanceCommitChainTailto avoid an early return whenunsignedAckedUpdatesis nil, so the full validation path always runsTestChanSyncUpdateFeeRestartAfterRevoketo reproduce the scenarioRoot Cause
During
ReceiveRevocation, the fee update is persisted underremoteUnsignedLocalUpdatesKey. However,restoreStateLogsonly handles HTLC-type updates (UpdateFulfillHTLC,UpdateFailHTLC, etc.) and silently skipsUpdateFeeviadefault: continue. After restart the update log is empty, the expected commitment is computed with the old fee rate, and the signature verification fails.Test Plan
TestChanSyncUpdateFeeRestartAfterRevokepasses