runtime refactoring: remove the claim queue from the scheduler #5529
Another idea for refactoring the scheduler: #5461 (comment)
I just remembered another reason we had the claim queue storage: we need the predictions in the claim queue to be stable; in particular, entries should not just disappear. If we removed the claim queue, this could happen with on-demand: e.g. when the number of on-demand cores changes, we could end up with assignments showing up on different cores depending on the relay parent. The assignments would move around between cores.
We could have the coretime assigner take care of this, right? Without that many changes, I think. We'll still need the equivalent of … So we could merge the claim queue concept into the coretime assigner. I wonder if the practicalities of this would complicate things instead.
This stability issue can actually also happen with bulk. Assignments can change at any time, hence you could arrive at different results. Arguably, getting the fresh information is more correct, but it also means that if you prepared a collation based on the claim queue, you would end up with that collation dropped because the claim queue changed. For bulk this actually seems acceptable ... as those collations had been produced for coretime you did not actually purchase, and it can also be fixed by having proper … TL;DR: We might have an issue with bulk too, but for bulk it is definitely easily fixable. For on-demand it is more tricky, but likely also solvable:
Hence, this is fixable for both bulk and on-demand by simulating the state of the block the claim queue entry is relevant for. (We can look into the future.) This should actually not be too hard and will also lead to more correct behavior than what we have with the claim queue as of now. Keeping the …
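To make the "look into the future" idea concrete, here is a minimal, self-contained sketch under a heavily simplified assigner model. `SimpleAssigner`, `Assignment` and `peek_at_depth` are hypothetical stand-ins for illustration only, not the real pallet types: instead of storing claims, the claim for relay block `now + depth` is derived by simulating the assigner forward `depth` blocks on a copy of its state.

```rust
// Hypothetical sketch: derive future claims by simulating a simplified assigner
// forward, rather than storing them in a claim queue.

#[derive(Clone, Debug)]
struct Assignment {
    para_id: u32,
}

#[derive(Clone)]
struct SimpleAssigner {
    /// Pending on-demand orders, oldest first.
    on_demand_queue: Vec<Assignment>,
    /// Number of cores currently serving on-demand.
    on_demand_cores: usize,
}

impl SimpleAssigner {
    /// Advance the model by one block: each on-demand core consumes one order.
    fn advance_one_block(&mut self) -> Vec<Assignment> {
        let served = self.on_demand_cores.min(self.on_demand_queue.len());
        self.on_demand_queue.drain(..served).collect()
    }

    /// Predict the assignments served `depth` blocks from now by simulating a copy
    /// of the current state forward. The prediction only depends on that state, so
    /// it does not jump around as long as the underlying orders do not change.
    fn peek_at_depth(&self, depth: usize) -> Vec<Assignment> {
        let mut future = self.clone();
        let mut served_at_depth = Vec::new();
        for _ in 0..=depth {
            served_at_depth = future.advance_one_block();
        }
        served_at_depth
    }
}

fn main() {
    let assigner = SimpleAssigner {
        on_demand_queue: (1u32..=5).map(|para_id| Assignment { para_id }).collect(),
        on_demand_cores: 2,
    };
    // Claims for the head block and for two blocks into the future, both derived
    // from the same state instead of being read from a stored queue.
    println!("now:     {:?}", assigner.peek_at_depth(0));
    println!("depth 2: {:?}", assigner.peek_at_depth(2));
}
```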
If we remove the TTL, the claim queue in the scheduler pallet does not make much sense any more.
The claim queue remains merely an intermediate buffer in front of the coretime assigner pallet, populated with up to `scheduling_lookahead` claims.
It also complicates things, as assignments need to be pushed back to the assigner on session changes (if the core count decreases).
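For illustration, here is a simplified, hypothetical model of that buffering and of the session-change push-back. The names `Scheduler`, `Assigner`, `fill_claim_queue` and `on_session_change` are made up for this sketch and are not the real pallet items:

```rust
// Simplified sketch of the claim queue as an intermediate buffer in front of the
// assigner, and of the push-back needed when the core count shrinks.

use std::collections::{BTreeMap, VecDeque};

type CoreIndex = u32;

#[derive(Clone, Debug)]
struct Assignment {
    para_id: u32,
}

struct Assigner {
    pending: VecDeque<Assignment>,
}

#[derive(Default)]
struct Scheduler {
    /// The buffer this issue proposes to remove: up to `lookahead` claims per core.
    claim_queue: BTreeMap<CoreIndex, VecDeque<Assignment>>,
}

impl Scheduler {
    /// Fill the buffer from the assigner, up to `lookahead` claims per core.
    fn fill_claim_queue(&mut self, assigner: &mut Assigner, cores: u32, lookahead: usize) {
        for core in 0..cores {
            let queue = self.claim_queue.entry(core).or_default();
            while queue.len() < lookahead {
                match assigner.pending.pop_front() {
                    Some(assignment) => queue.push_back(assignment),
                    None => return,
                }
            }
        }
    }

    /// The session-change complication: claims buffered for cores that no longer
    /// exist must be handed back to the assigner so they are not silently lost.
    fn on_session_change(&mut self, assigner: &mut Assigner, new_core_count: u32) {
        let dropped: Vec<CoreIndex> = self
            .claim_queue
            .keys()
            .copied()
            .filter(|core| *core >= new_core_count)
            .collect();
        for core in dropped {
            if let Some(queue) = self.claim_queue.remove(&core) {
                // Push back in reverse so the original order is preserved.
                for assignment in queue.into_iter().rev() {
                    assigner.pending.push_front(assignment);
                }
            }
        }
    }
}

fn main() {
    let mut assigner = Assigner {
        pending: (1u32..=6).map(|para_id| Assignment { para_id }).collect(),
    };
    let mut scheduler = Scheduler::default();
    scheduler.fill_claim_queue(&mut assigner, 3, 2);
    // Core count drops from 3 to 2: claims buffered for core 2 go back to the assigner.
    scheduler.on_session_change(&mut assigner, 2);
    println!("claim queue: {:?}", scheduler.claim_queue);
    println!("assigner:    {:?}", assigner.pending);
}
```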
Backwards compatibility with the claim queue runtime API will be maintained. The logic will be modified to peek up to `scheduling_lookahead` positions in the coretime assigner. This greatly simplifies the scheduler pallet and helps with the separation of concerns.
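A hedged sketch of what the backwards-compatible runtime API could look like under this proposal: the claim queue returned to clients keeps its shape, but is assembled on the fly by peeking into the assigner instead of reading removed storage. `peek_assignments` is an assumed helper standing in for whatever the coretime assigner would expose; it is not an existing call.

```rust
// Hypothetical sketch: serve the existing claim-queue runtime API by peeking up to
// `scheduling_lookahead` positions into the assigner instead of reading stored claims.

use std::collections::{BTreeMap, VecDeque};

type CoreIndex = u32;
type ParaId = u32;

/// Assumed assigner-side helper: the next `depth` assignments per core, without
/// consuming them. Stub data stands in for the assigner's real state.
fn peek_assignments(depth: usize) -> BTreeMap<CoreIndex, Vec<ParaId>> {
    let stub: BTreeMap<CoreIndex, Vec<ParaId>> =
        BTreeMap::from([(0, vec![1000, 1001, 1002]), (1, vec![2000])]);
    stub.into_iter()
        .map(|(core, paras)| (core, paras.into_iter().take(depth).collect()))
        .collect()
}

/// The runtime API keeps its existing shape (a queue of claims per core), but the
/// answer is now derived from the assigner rather than from scheduler storage.
fn claim_queue(scheduling_lookahead: usize) -> BTreeMap<CoreIndex, VecDeque<ParaId>> {
    peek_assignments(scheduling_lookahead)
        .into_iter()
        .map(|(core, paras)| (core, paras.into_iter().collect()))
        .collect()
}

fn main() {
    // With a lookahead of 2, core 0 exposes two claims and core 1 exposes one.
    println!("{:?}", claim_queue(2));
}
```

In the real runtime this would of course go through the runtime API trait and whatever interface the assigner pallet ends up exposing; the point of the sketch is only that the API surface seen by clients can stay unchanged.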
As @eskimor pointed out, it also has the nice side effect that the latency between ordering an on-demand core and being able to use that core will decrease. This is because the on-demand order extrinsic reaches the on-demand assigner but not the claim queue (the claim queue is filled during inherent data processing, which happens before extrinsics are applied). There is therefore a latency of at least one block between when the order is fulfilled and when the assignment can be used. If we modify the runtime API to look directly into the assigner, the parachain can utilise this claim right at the next block (provided that the core is available).