-
Notifications
You must be signed in to change notification settings - Fork 740
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prune blobs #3852
Prune blobs #3852
Conversation
I'm not sure what we want to use this field for, a copy of the head of the chain or the block root mapping to the last stored blobs sidecar? For the latter, since we don't store empty blobs sidecars, it doesn't serve as data to calculate if more than |
Wow, this one's turning out to be pretty tricky! Left some comments and would also like to loop in @michaelsproul to see what he thinks
Feel free to delete that one or restructe that struct however you need, I made it thinking we would have a more decoupled block + blob sync design, so really was just best-guessing on fields it would require. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok sorry for the very long review, this melted my brain a little because I was torn between reviewing the conceptual design and the code.
I think for blobs we may need to use a pruning algorithm that's quite different from the ones we use for blocks, states and payloads.
The fundamental reason for this is that we need to prune with or without finality (see discussion here ethereum/consensus-specs#3141).
The high-level algorithm I'm thinking is something like this:
- Keep
try_prune_blobs
method, but call it regularly fromafter_new_head
on a background thread. Make it cheap to bail out if pruning doesn't need to occur yet. - Define whether pruning needs to occur in terms of the
last_pruned_epoch
, the current epoch, and a new configuration parameterepochs_per_blob_prune
which optionally prevents pruning every single epoch. The condition for pruning becomes:
let new_pruned_epoch = current_epoch - *MIN_EPOCHS_FOR_BLOB_SIDECARS_REQUESTS;
let need_to_prune = last_pruned_epoch + epochs_per_blob_prune <= new_pruned_epoch;
- When
need_to_prune == true
, prune the blobs using a forwards block iterator that uses the freezer block roots, and backtracks from the head state if it needs to. Ideally to keep this cheap we have anArc
clone of the head state on hand which we pass in a lazy closure toforwards_block_roots_iterator_until
. The pruning proceeds roughly as it does now, moving fromlast_pruned_epoch
tonew_pruned_epoch
and deleting as it goes. - After pruning finishes we updated
last_pruned_epoch
tonew_pruned_epoch
and store that on disk. - In addition, unconditionally delete blobs for any blocks that conflict with finalization here:
lighthouse/beacon_node/beacon_chain/src/migrate.rs
Lines 571 to 576 in 680a5c7
.flat_map(|block_root: Hash256| { | |
[ | |
StoreOp::DeleteBlock(block_root), | |
StoreOp::DeleteExecutionPayload(block_root), | |
] | |
}) |
This assumes that blob availability guarantees do not need to apply to orphaned blocks that are permanently discarded due to finalization. However maybe this isn't a safe assumption in case of a maliciously finalized chain as Dankrad describes here: ethereum/consensus-specs#3141 (comment). If we wanted to handle that case we'd need a way of collecting blobs from all chains in the pruning window. We could do this using the head tracker as we do for block and state pruning (see this), although that will run slower.
Another complication that my pruning algorithm doesn't address is that pruned blobs should be considered available even after they've been pruned: ethereum/consensus-specs#3169. One way to handle that would be to store a smaller bit of data in a separate column in the database mapping BlockRoot => ()
. Basically a set of block roots for all pruned blocks. We could probably get away with just storing this for orphaned blocks that aren't finalized, as we can already lookup whether a block root is part of the finalized canonical chain (which would imply availability?)
Ok that was a lot! Hopefully this gives us some directions for discussion and progress 🤞
ae97dbd
to
f428c41
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updates look good!
I need to have a more thorough review when I've got more time, but this looks spot on.
Co-authored-by: Michael Sproul <micsproul@gmail.com>
Co-authored-by: Michael Sproul <micsproul@gmail.com>
Co-authored-by: Michael Sproul <micsproul@gmail.com>
Co-authored-by: Michael Sproul <micsproul@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great ! Awesome job @emhane
I've told people to use start using this branch on devnet 4 anyways so we could rebase on latest capella
https://github.com/realbigsean/lighthouse/tree/devnet-4-compat
Is there a plan to support pruning blobs automatically instead of manually? |
@zhiqiangxu It is already automatic |
@michaelsproul Oh I missed the |
Issue Addressed
#3625 (Secondary: blob pruning)
Proposed Changes
Prune blobs after MIN_EPOCHS_FOR_BLOBS_SIDECARS_REQUESTS elapsed
Additional Info