-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lr/eqc voting #3851
Lr/eqc voting #3851
Conversation
* tests and CI * remove `async-std` * `fmt` * fix doc build * remove counter tests * remove counter tests * build w/o lockfile * `lock` -> `std` * Revert "`lock` -> `std`" This reverts commit 21ebf05. * lock * `async_broadcast` * overflow * Revert "`async_broadcast`" This reverts commit f03bb57. * `try_send`
crates/types/src/consensus.rs
Outdated
if parent_leaf.height() == 0 { | ||
return true; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure but this is probably not enough. Is there a way to check if the leaf is the genesis leaf?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe just check if proposed_leaf.justify_qc is the gensis qc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think maybe the only way is if the QC view number is the genesis view TYPES::View::genesis
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed the condition to use TYPES::View::genesis
as you suggested. Getting rid of the magic number is an additional benefit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes sense, I need to digest a bit more on how all the logic fits together
if is_vote_leaf_extended { | ||
// We do not check if we are the leader when receiving a vote for the last eQC proposal. | ||
// Every node should receive the votes and form an eQC. | ||
tracing::debug!("We have received a vote for the last eQC proposal"); | ||
} else { | ||
// Are we the leader for this view? | ||
ensure!( | ||
task_state | ||
.quorum_membership | ||
.leader(vote.view_number() + 1, task_state.cur_epoch)? | ||
== task_state.public_key, | ||
info!( | ||
"We are not the leader for view {:?}", | ||
vote.view_number() + 1 | ||
) | ||
); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: ensure!(is_vote_leaf_extended || we_are_leader)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thank you!
crates/types/src/consensus.rs
Outdated
let Some(View { | ||
view_inner: ViewInner::Leaf { | ||
leaf: leaf_commit, .. | ||
}, | ||
}) = self.validated_state_map.get(&parent_leaf.view_number()) | ||
else { | ||
tracing::error!("Couldn't find the provided old leaf in the consensus"); | ||
return false; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have the parent leaf, and need the commit why not just use parent_leaf.commit() instead of looking up in the map (where the leaf presumably already came from)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I guess I got confused by two commit
methods existing for the Leaf
. I didn't want to use the upgrade lock and make the method async.
@@ -963,6 +966,25 @@ impl<TYPES: NodeType> Consensus<TYPES> { | |||
let epoch_number = epoch_from_block_number(next_block_number, self.epoch_height); | |||
Ok(TYPES::Epoch::new(epoch_number)) | |||
} | |||
|
|||
/// Returns true if the `parent_leaf` formed an eQC for the previous epoch to the `proposed_leaf` | |||
pub fn check_eqc(&self, proposed_leaf: &Leaf<TYPES>, parent_leaf: &Leaf<TYPES>) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is checking does proposed_leaf
contain an eQC as it' justify QC, is that right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, correct. I could remove the parent_leaf
parameter, use the proposed_leaf
's justify qc to get the parent leaf from the shared state instead, if it makes more sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was just making sure I was correct, if we already have the parent then it's fine to pass it in, but probably this function makes sense as just starting from the proposed_leaf
crates/types/src/consensus.rs
Outdated
if parent_leaf.height() == 0 { | ||
return true; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe just check if proposed_leaf.justify_qc is the gensis qc?
crates/task-impls/src/network.rs
Outdated
let is_vote_leaf_extended = self | ||
.consensus | ||
.read() | ||
.await | ||
.is_leaf_extended(vote.data.leaf_commit); | ||
let transmit_type = if is_vote_leaf_extended { | ||
tracing::debug!("We're voting for the last eQC proposal. Broadcast the vote."); | ||
TransmitType::Broadcast | ||
} else { | ||
let leader = match self.quorum_membership.leader(view_number, self.epoch) { | ||
Ok(l) => l, | ||
Err(e) => { | ||
tracing::warn!( | ||
"Failed to calculate leader for view number {:?}. Error: {:?}", | ||
view_number, | ||
e | ||
); | ||
return None; | ||
} | ||
}; | ||
TransmitType::Direct(leader) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like this logic is more appropriate in the quorum_vote task, maybe just add the ExtendedVoteSend and handle that event here instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good idea. Done, thank you.
// The proposal is safe if | ||
// 1. the proposed block and the justify QC block belong to the same epoch or | ||
// 2. the justify QC is the eQC for the previous block | ||
let proposal_epoch = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we also need all this logic in the validate_proposal_liveness
path, I suppose we can't actually do it there because we are missing the previous proposal but don't we need something to check the epochs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we cannot check the epochs there. We can only calculate the epoch based on the block number. I don't think we have an access to the parent's block number if we don't have the parent leaf. Did I miss something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so, I was just trying to be sure we didn't. I also don't think we can do the check in the liveness path, and we won't vote in that case either
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some minor things that stuck out to me
@@ -399,6 +399,16 @@ impl< | |||
TransmitType::Direct(leader), | |||
)) | |||
} | |||
HotShotEvent::ExtendedQuorumVoteSend(vote) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a reason we don't want to just use QuorumVoteSend
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -298,6 +298,7 @@ pub(crate) async fn submit_vote<TYPES: NodeType, I: NodeImplementation<TYPES>, V | |||
storage: Arc<RwLock<I::Storage>>, | |||
leaf: Leaf<TYPES>, | |||
vid_share: Proposal<TYPES, VidDisperseShare<TYPES>>, | |||
extended_vote: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand the reason for this either -- are we planning for the submit_vote
to actually be different for an eqc vs a regular qc? it seems like the only difference is we create a different internal hotshot event (which gets broadcast in exactly the same way)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only difference is that the last vote in 3-chain is broadcast instead of just sent to the next leader. To differentiate between the last vote in the 3-chain against any other vote, we need to check the two leaves up the chain. That logic seems to belong to the voting part instead of networking part. We now use the new ExtendedQuorumVoteSend
event to differentiate on the networking part.
Closes #3730
This PR:
This PR does not:
Key places to review: