Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lr/increment epoch #3828

Merged
merged 62 commits into from
Nov 13, 2024
Merged

Lr/increment epoch #3828

merged 62 commits into from
Nov 13, 2024

Conversation

lukaszrzasik
Copy link
Contributor

@lukaszrzasik lukaszrzasik commented Oct 31, 2024

Part of #3730

This PR:

  • Adds logic to update epoch if relevant when views are updated
  • Adds view and epoch updating logic when Quorum Certificate is formed

This PR does not:

Key places to review:

@lukaszrzasik lukaszrzasik marked this pull request as ready for review October 31, 2024 14:49
Comment on lines 239 to 246
broadcast_event(
Arc::new(HotShotEvent::ViewChange(
self.view_number + 1,
next_leaf_epoch,
)),
&self.sender,
)
.await;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bfish713 I've moved emitting ViewChange so now it's after the shared state has been updated. Is that fine?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should not matter much, we could even maybe just put the view change in the QuorumVoteDependenciesValidated handler

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or even QuorumVoteSend handler. But either way this should be ok

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok one problem is that some of the previous stuff is faillable, so we might not emit the event in some cases we did before, like we didn't see the last proposal and can't fetch it in time, does the epoch number calculation depend on the state update?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it does depend on it. It's because the epoch depends on the block number. We need to check whether the next proposed leaf will be for the next block or the same block again. We do that by checking the 3-chain.

Now that I think about it again, I don't think that we should switch epoch when we vote for the last leaf of the 3-chain. The fact that a node votes for it doesn't mean that the eQC will be formed and a node should switch to a new epoch only after seeing an eQC.

Is it important to switch view as soon as we vote for the proposal?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we most switch views here, switching views is basically the same as starting the timeout for the next view. If we don't do it here and the next proposal doesn't come in are behavior will not be correct

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed the logic so that the epoch is only updated after validating the proposal or when a node formed a QC.

@lukaszrzasik lukaszrzasik marked this pull request as draft November 7, 2024 14:22
@lukaszrzasik
Copy link
Contributor Author

Moving back to draft because there is some regression in restart test. Need to figure that out.

@lukaszrzasik lukaszrzasik marked this pull request as ready for review November 7, 2024 18:05
Copy link
Contributor

@ss-es ss-es left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some random comments, looks fine on the whole

have some longer questions I will ask on zulip (about the quorum_vote and consensus.rs files)

@@ -99,6 +101,9 @@ impl<TYPES: NodeType> TestStorage<TYPES> {
pub async fn last_actioned_view(&self) -> TYPES::View {
self.inner.read().await.action
}
pub async fn last_epoch(&self) -> TYPES::Epoch {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this be last_actioned_epoch for consistency?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thank you!

Comment on lines 963 to 967
let epoch_number = if next_block_number % self.epoch_height == 0 {
next_block_number / self.epoch_height
} else {
next_block_number / self.epoch_height + 1
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this use the epoch_from_block_number function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I would like to but I would have to make the types crate dependent on the task-impls crate. Do we want to do that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've moved the epoch_from_block_number function to the types crate. This way we can use it in all the places without adding new dependencies.

if self.epoch_height == 0 {
return Ok(TYPES::Epoch::new(0));
}
let Some(validated_view) = self.validated_state_map.get(&view) else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think this (and following let .. = .. else .. could be written more simply as let validated_view = self.validated_state_map.get(&view).context("...")?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the code. Thanks!

lukaszrzasik and others added 11 commits November 8, 2024 12:00
* tests and CI

* remove `async-std`

* `fmt`

* fix doc build

* remove counter tests

* remove counter tests

* build w/o lockfile

* `lock` -> `std`

* Revert "`lock` -> `std`"

This reverts commit 21ebf05.

* lock

* `async_broadcast`

* overflow

* Revert "`async_broadcast`"

This reverts commit f03bb57.

* `try_send`
Comment on lines 973 to 974
/// 1. The leaf from the given view is not for the last block.
/// The epoch should be calculated based on the next block number.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I understand this case, if the leaf is not the last block (say it's height 3 and epoch length is 100), then isn't this the trivial case and can just be handled without going into the validated state map? We already got the height before calling this function right?

Basically I expect the code on forming eqc to just do the decide rule, and if the block decided is the last of the epoch we can advance, all other cases we stay in the current epoch. I can see how the logic kind of does that, but it's hard to follow when the case for "I'm not even trying to forma an eqc" and "I just formed an eqc" end up being handled by the same line of code and It's a little confusing that is_leaf_forming_eqc is false when we form the eqc?

Comment on lines 231 to 236
broadcast_event(
Arc::new(HotShotEvent::ViewChange(new_view, next_view_epoch)),
&event_sender,
)
.await;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we discussed only doing the epoch change when we get the proposal containing this eQC. Idk how much it matters, and I think we'll discuss today the timeout path where this becomes relevant. It's mostly a matter of what epoch do I want to timeout to and when. I don't have a problem with this way, just noting.

@lukaszrzasik lukaszrzasik merged commit c04a16a into main Nov 13, 2024
16 checks passed
@lukaszrzasik lukaszrzasik deleted the lr/increment-epoch branch November 13, 2024 18:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants