Skip to content

[Chunk Data Pack Pruner] Add block view index #6933

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 17 commits into
base: master
Choose a base branch
from

Conversation

zhangchiqing
Copy link
Member

@zhangchiqing zhangchiqing commented Jan 24, 2025

This PR adds an index for lookup block by view. The index is only added to certified block which has received a QC.

This index allows us to iterate through each view, and prune data indexed by the block ID.

Note: The method is not available until either next spork or a migration to build the index is completed.

@codecov-commenter
Copy link

codecov-commenter commented Jan 24, 2025

Codecov Report

Attention: Patch coverage is 50.58824% with 42 lines in your changes missing coverage. Please review.

Project coverage is 41.27%. Comparing base (93369df) to head (5599405).

Files with missing lines Patch % Lines
utils/unittest/fixtures.go 0.00% 20 Missing ⚠️
storage/badger/blocks.go 0.00% 8 Missing ⚠️
state/protocol/badger/mutator.go 68.42% 4 Missing and 2 partials ⚠️
storage/badger/headers.go 86.66% 3 Missing and 1 partial ⚠️
storage/badger/operation/headers.go 20.00% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6933      +/-   ##
==========================================
- Coverage   41.28%   41.27%   -0.02%     
==========================================
  Files        2160     2160              
  Lines      189388   189459      +71     
==========================================
+ Hits        78192    78201       +9     
- Misses     104677   104736      +59     
- Partials     6519     6522       +3     
Flag Coverage Δ
unittests 41.27% <50.58%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • Test Analytics: Detect flaky tests, report on failures, and find test suite problems.


viewCache: newCache(collector, metrics.ResourceCertifiedView,
withLimit[uint64, flow.Identifier](4*flow.DefaultTransactionExpiry),
withRetrieve(retrieveView)),
Copy link
Member Author

@zhangchiqing zhangchiqing Jan 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cache is only filled by retrieval op.

// certified blocks are the blocks that have received QC.
//
// TODO: this method is not available until next spork (mainnet27) or a migration that builds the index.
// ByView(view uint64) (*flow.Header, error)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will uncomment in next spork.

@zhangchiqing zhangchiqing marked this pull request as ready for review January 24, 2025 19:38
@zhangchiqing zhangchiqing requested a review from a team as a code owner January 24, 2025 19:38
Copy link
Member

@AlexHentschel AlexHentschel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I very much agree with implementing a ByView indexing and retrieval. Though, I am a bit uncomfortable with the assumptions and undocumented dependencies that the proposed change would introduce:

  • the QuorumCertificates populates the ByView index without ever seeing a block, while the Headers just assumes that if that index exist, a corresponding block has been previously stored. Nowhere does QuorumCertificates document that only QCs for known blocks are to be stored.
  • QuorumCertificates and Headers are low-level storage abstractions, but they make very specific assumptions how the higher level code (specifically the Mutator) is supposed to use them:
    // while storing qc, we also index the block by view in the same transaction.
    Also that is not in detail documented as part of the API, so for anyone to not accidentally violate this, they would need to go into he implementation and check, which in my experience engineers don't do.
    • Just hypothetically, if somebody has the great idea to persist QCs on consensus nodes once they construct them (in case the node crashes) and maybe even do that for QCs pertaining to unknown blocks, the assumption made in QuorumCertificates would be void. Its probably not the best example, but it is an example how fragile those assumptions can be.

So in summary, I am to very comfortable with changing the meaning of QuorumCertificates from "storing quorum certificates" to "storing QuorumCertificates only for known blocks". Essentially, we are introducing additional dependencies and assumptions between QuorumCertificates, Headers and Mutator. While in each individual case, the coupling is small, over time such couplings accumulate across the code base and make the code very difficult to maintain and to guarantee correctness. A major focus of our work is managing complexity and guaranteeing correctness, if that requires to write a bit more boilerplate code - I think that would be a good tradeoff.

Some thoughts / suggestions:

  • since Headers retrieves the blocks by view and assumes that for an indexed view a block exists, it should also populate the by-view index in my opinion (keeping the code that deals with the same thing in one place).

  • The Mutator is the module that has the authority to declare blocks as finalized, sealed, etc ... I think the authority to declare a block as certified also falls within that category (certainly I would discourage the give the storage layer the authority to declare a block certified). I would suggest to make that explicit by having the Mutator call a dedicated method IndexCertifiedBlockByView(flow.Identifier, uint64) that is provided by the storage abstraction (e.g. Headers).

  • In the way I see it Blocks is an extension of Headers.

    • Therefore, also Blocks could offer the ByView method
    • Blocks could also offer the DeclareCertifiedTx method and delegate that to IndexCertifiedBlockByView (?) To me, that would seem very consistent.
  • method IndexCertifiedBlockByView would need to be called by the Mutator in the following two places:

@zhangchiqing zhangchiqing force-pushed the leo/add-block-view-index branch from e6cb39d to 98f6014 Compare January 31, 2025 23:21
@@ -365,6 +365,12 @@ func (m *FollowerState) headerExtend(ctx context.Context, candidate *flow.Block,
return fmt.Errorf("could not store incorporated qc: %w", err)
}
} else {
// parent is a block that has been received and certified by a QC.
err := transaction.WithTx(operation.SkipDuplicates(operation.IndexCertifiedBlockByView(parent.View, qc.BlockID)))(tx)
Copy link
Member Author

@zhangchiqing zhangchiqing Feb 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AlexHentschel I was using the db operation directly, for 2 reasons:

  • I wanted to add a IndexCertifiedBlockByView method to headers, and simply call m.headers.IndexCertifiedBlockByView(parent) here. But the storage.Headers interface is mostly for read methods, we didn't expose methods like IndexFinalizedBlockByView because we didn't want this method being misused.
  • For the same reason I also didn't add IndexCertifiedBlockByView method to Blocks, and also, the parent block body doesn't exist in this context, passing a block to blocks.IndexCertifiedBlockByView would require us to make additional query to the parent block body.

@AlexHentschel
Copy link
Member

regarding your suggestion for file storage/badger/operation/headers.go

we might want to change IndexBlockHeight into IndexFinalizedBlockByHeight?

yes, I would appreciate that :-)

@zhangchiqing zhangchiqing force-pushed the leo/add-block-view-index branch from 9953c43 to f383dcb Compare February 26, 2025 23:21
@zhangchiqing
Copy link
Member Author

zhangchiqing commented Feb 26, 2025

we might want to change IndexBlockHeight into IndexFinalizedBlockByHeight?

Fixed cd0ec0c

@zhangchiqing zhangchiqing force-pushed the leo/add-block-view-index branch from f383dcb to cd0ec0c Compare February 26, 2025 23:43
Copy link
Member

@AlexHentschel AlexHentschel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Leo for the great PR. Appreciate your patience with the review time 🙈.

I have created a documentation PR on aspects that I wondered about while reviewing your PR ... mostly forward-facing thoughts on where the mutator code would need to change in the future when moving to pebble. I think the only place where my PR is relevant for your immediate changes is my argument that we don't need to Skip Duplicates (my comments here and here).
I am happy if you merge your PR first, and then we iterate over my documentation - don't feel obligated to include that as part of your PR .... though, don't hesitate to merge mine it into your branch if you feel it would help Jordan with his review 🤷.

// ByView returns the block with the given view. It is only available for certified blocks.
// Certified blocks are the blocks that have received QC. Hotstuff guarantees that for each view,
// at most one block is certified. Hence, the return value of `ByView` is guaranteed to be unique
// oven for non-finalized blocks.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// oven for non-finalized blocks.
// even for non-finalized blocks.
// Expected errors during normal operations:
// - `storage.ErrNotFound` if no certified block is known at given view.

@@ -23,6 +23,12 @@ type Blocks interface {
// for finalized blocks.
ByHeight(height uint64) (*flow.Block, error)

// ByView returns the block with the given view. It is only available for certified blocks.
// certified blocks are the blocks that have received QC.
Copy link
Member

@AlexHentschel AlexHentschel Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// certified blocks are the blocks that have received QC.
// certified blocks are the blocks that have received QC. Hotstuff guarantees that for each view,
// at most one block is certified. Hence, the return value of `ByView` is guaranteed to be unique
// even for non-finalized blocks.
// Expected errors during normal operations:
// - `storage.ErrNotFound` if no certified block is known at given view.

err := headers.Store(block.Header)
require.NoError(t, err)

// verify storing the block doesn't not index the view automatically
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// verify storing the block doesn't not index the view automatically
// Verify storing the block does not index the view automatically. It would not be safe to
// do that, since a byzantine leader might produce multiple proposals for the same view.

_, err = headers.BlockIDByView(block.Header.View)
require.ErrorIs(t, err, storage.ErrNotFound)

// index block by view
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// index block by view
// Though, HotStuff guarantees that for each view at most one block is certified. Hence, we can index
// a block by its view _only_ once the block is certified. When a block is certified is decided by the
// higher-level consensus logic, which then triggers the indexing of the block by its view via a
// dedicated method call:

@@ -48,6 +49,14 @@ func NewHeaders(collector module.CacheMetrics, db *badger.DB) *Headers {
}
}

retrieveView := func(view uint64) func(tx *badger.Txn) (flow.Identifier, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[entirely optional] wondering if it would increase clarity of the code if we renamed:

@@ -110,6 +123,19 @@ func (h *Headers) ByHeight(height uint64) (*flow.Header, error) {
return h.retrieveTx(blockID)(tx)
}

// ByView returns block header for the given view. It is only available for certified blocks.
Copy link
Member

@AlexHentschel AlexHentschel Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// ByView returns block header for the given view. It is only available for certified blocks.
// ByView returns block header for the given view. It is only available for certified blocks.
// Certified blocks are the blocks that have received QC. Hotstuff guarantees that for each view,
// at most one block is certified. Hence, the return value of `ByView` is guaranteed to be unique
// even for non-finalized blocks.
// Expected errors during normal operations:
// - `storage.ErrNotFound` if no certified block is known at given view.
//
// Note: this method is not available until next spork or a migration that builds the index.

Comment on lines +23 to +25
// IndexCertifiedBlockByView indexes the view of a block.
// HotStuff guarantees that there is at most one certified block. Caution: this does not hold for
// uncertified proposals, as byzantine actors might produce multiple proposals for the same block.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// IndexCertifiedBlockByView indexes the view of a block.
// HotStuff guarantees that there is at most one certified block. Caution: this does not hold for
// uncertified proposals, as byzantine actors might produce multiple proposals for the same block.
// IndexCertifiedBlockByView indexes a block by its view.
// HotStuff guarantees that there is at most one certified block per view. Caution: this does not hold for
// uncertified proposals, as a byzantine leader might produce multiple proposals for the same view.

@@ -83,6 +83,17 @@ func (b *Blocks) ByHeight(height uint64) (*flow.Block, error) {
return b.retrieveTx(blockID)(tx)
}

func (b *Blocks) ByView(view uint64) (*flow.Block, error) {
Copy link
Member

@AlexHentschel AlexHentschel Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
func (b *Blocks) ByView(view uint64) (*flow.Block, error) {
// ByView returns the block with the given view. It is only available for certified blocks.
// certified blocks are the blocks that have received QC. Hotstuff guarantees that for each view,
// at most one block is certified. Hence, the return value of `ByView` is guaranteed to be unique
// even for non-finalized blocks.
// Expected errors during normal operations:
// - `storage.ErrNotFound` if no certified block is known at given view.
func (b *Blocks) ByView(view uint64) (*flow.Block, error) {

In the Headers implementation, you included the following note:

// Note: this method is not available until next spork or a migration that builds the index.

Not sure if it would be beneficial to also add here?

@@ -365,6 +365,12 @@ func (m *FollowerState) headerExtend(ctx context.Context, candidate *flow.Block,
return fmt.Errorf("could not store incorporated qc: %w", err)
}
} else {
// parent is a block that has been received and certified by a QC.
err := transaction.WithTx(operation.SkipDuplicates(operation.IndexCertifiedBlockByView(parent.View, qc.BlockID)))(tx)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't need to skip duplicates here. The reasoning is in detail documented in my follow-up PR #7101

Suggested change
err := transaction.WithTx(operation.SkipDuplicates(operation.IndexCertifiedBlockByView(parent.View, qc.BlockID)))(tx)
err := transaction.WithTx(operation.IndexCertifiedBlockByView(parent.View, qc.BlockID))(tx)

@@ -389,6 +395,13 @@ func (m *FollowerState) headerExtend(ctx context.Context, candidate *flow.Block,
if err != nil {
return fmt.Errorf("could not store certifying qc: %w", err)
}

// candidate is a block that has been received and certified by a QC
err := transaction.WithTx(operation.SkipDuplicates(operation.IndexCertifiedBlockByView(candidate.Header.View, blockID)))(tx)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't need to skip duplicates here. The reasoning is in detail documented in my follow-up PR #7101

Suggested change
err := transaction.WithTx(operation.SkipDuplicates(operation.IndexCertifiedBlockByView(candidate.Header.View, blockID)))(tx)
err := transaction.WithTx(operation.IndexCertifiedBlockByView(candidate.Header.View, blockID))(tx)

…estions

documentation for `mutator` wrt concurrent additions of blocks
Copy link
Member

@jordanschalm jordanschalm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. There are a few documentation changes I didn't understand, but I think those are from @AlexHentschel's PR.

@@ -135,6 +135,9 @@ func NewFullConsensusState(
// CAUTION:
// - This function expects that `certifyingQC ` has been validated. (otherwise, the state will be corrupted)
// - The parent block must already have been ingested.
// - Attempts to extend the state with the _same block concurrently_ are not allowed.
// (will not corrupt the state, but may lead to an exception)
// Orphaned blocks are excepted.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Orphaned blocks are excepted.

I'm not clear on what this means. Excepted as in the above caution doesn't apply? Or do they trigger an exception? @AlexHentschel

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it means "above caution does not apply", I don't think it is accurate. If it means "triggers an exception" I think we should say that instead.

Comment on lines +203 to +205
// Therefore, they should be identified
// additional are just a no-op if they are identified as such early enough by the `checkBlockAlreadyProcessed` - though if the same extension is added concurrently, we might return a
// [storage.ErrAlreadyExists] error.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Therefore, they should be identified
additional are just a no-op if they are identified as such early enough by the `checkBlockAlreadyProcessed [...]

I'm not understanding this sentence... 😅 @AlexHentschel

// parent has already been ingested. Otherwise, an exception is returned.
// - Attempts to extend the state with the _same block concurrently_ are not allowed.
// (will not corrupt the state, but may lead to an exception)
// Orphaned blocks are excepted.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment here - it isn't clear what "orphaned blocks are excepted" means.

@@ -53,8 +53,17 @@ type FollowerState interface {
// CAUTION:
// - This function expects that `qc` has been validated. (otherwise, the state will be corrupted)
// - The parent block must already be stored.
// - Attempts to extend the state with the _same block concurrently_ are not allowed.
// (will not corrupt the state, but may lead to an exception)
// Orphaned blocks are excepted.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also here

// parent has already been ingested. Otherwise, an exception is returned.
// - Attempts to extend the state with the _same block concurrently_ are not allowed.
// (will not corrupt the state, but may lead to an exception)
// Orphaned blocks are excepted.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here

Comment on lines 94 to 101
header, err := b.headers.ByView(view)
if err != nil {
return nil, err
}

tx := b.db.NewTransaction(false)
defer tx.Discard()
return b.retrieveTx(header.ID())(tx)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
header, err := b.headers.ByView(view)
if err != nil {
return nil, err
}
tx := b.db.NewTransaction(false)
defer tx.Discard()
return b.retrieveTx(header.ID())(tx)
blockID, err := b.headers.BlockIDByView(view)
if err != nil {
return nil, err
}
tx := b.db.NewTransaction(false)
defer tx.Discard()
return b.retrieveTx(blockID)(tx)

Use the underlying index directly to save a encode/hash and potentially an extra database read.

}

return h
}

// StoreTx allows us to store a new header, as part of a DB transaction, while still going through
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// StoreTx allows us to store a new header, as part of a DB transaction, while still going through
// storeTx allows us to store a new header, as part of a DB transaction, while still going through

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants