Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prepare Core runtime API for MBMs #13

Merged
merged 19 commits into from
Mar 17, 2024
Merged

Prepare Core runtime API for MBMs #13

merged 19 commits into from
Mar 17, 2024

Conversation

ggwpez
Copy link
Member

@ggwpez ggwpez commented Jul 24, 2023

This RFC aims to ratify the proposed changes to the Core runtime API.

Implementation is being conducted in paritytech/polkadot-sdk#1781.
Status: RFC has been slimmed down and the controversial change to BlockBuilder removed by finding an alternative.

Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
@ggwpez ggwpez changed the title [WIP] Runtime API BlockBuilder v5 and Core v7 Runtime API BlockBuilder v5 and Core v7 Jul 24, 2023
@ggwpez ggwpez marked this pull request as ready for review July 24, 2023 16:22
@ggwpez ggwpez self-assigned this Jul 24, 2023
@ggwpez ggwpez requested a review from bkchr July 24, 2023 16:23
@ggwpez ggwpez changed the title Runtime API BlockBuilder v5 and Core v7 Prepare BlockBuilder and Core runtime API for MBMs Jul 24, 2023
Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
Copy link
Contributor

@kianenigma kianenigma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To the best of my knowledge, the property of "Inherents must come first" is a FRAME concept, and in itself FRAME can rather easily be tweaked to support inherents that might come later in the block.

It seems like this RFC is baking this assumption into Substrate's client side abstractions as well.

Can you elaborate further on the state of the above statement? Is it correct? Can substrate ever support inherents that will not come first?

@tomaka
Copy link
Contributor

tomaka commented Jul 26, 2023

The way the client authors a block is:

  • Call Core_initialize.
  • Call BlockBuilder_inherent_extrinsics, passing as parameter a struct containing for example the timestamp and some parachain head I'm not familiar with. The runtime returns a list of extrinsics that are not yet applied.
  • Call BlockBuilder_apply_extrinsic once per extrinsic returned by BlockBuilder_inherent_extrinsics.
  • (new with this RFC) Call BlockBuilder_after_inherents.
  • Call BlockBuilder_apply_extrinsic once per extrinsic pulled from the transaction pool, until a certain time has passed.
  • Call BlockBuilder_finalize_block.

The Substrate client does not support inherents that don't come first.

As I've mentioned above, the runtime is in control of the list of inherents that will be applied, as it decides of the list returned by BlockBuilder_inherent_extrinsics.

@kianenigma
Copy link
Contributor

The Substrate client does not support inherents that don't come first.

You are mentioning the new procedure, and of course, in the new format the Substrate client only accepts inherents that come first.

But this is a new assumption, as per this RFC, and I am asking why that is the case.

@tomaka
Copy link
Contributor

tomaka commented Jul 26, 2023

This is not a new assumption. The block building procedure that the Substrate client follows always puts inherents first and has always done so.

If you want to add the possibility for the inherents to not come first, you have to add a way for the runtime to somehow tell the client when to insert the inherents, because no such thing exists right now. The client cannot magically know the order in which it can insert things, all it does is follow the instructions from the runtime.

@JoshOrndorff
Copy link

JoshOrndorff commented Jul 26, 2023

I also want to chime in about the idea of inherents always being first. I believe @tomaka's point is that the block builder has always put inherents at the beginning and this is not new. He is correct about that. I believe @kianenigma's point is that this was never specified as the the only way to do it, but rather just the way that the block builder happened to be implemented.

I share @kianenigma's concern that this removes flexibility, and I have a concrete use case. Tuxedo is a runtime framework for building utxo-based chains (it is vaguely analogous to frame).

Constraints:

  1. In the UTXO model there is one very important invariant which is that every piece of state (utxo) is created explicitly as the output of some transaction.
  2. It is also common for the difference in input and output values in the transactions to go to the block author as a reward.

In order for the block author's reward to be explicitly created by a transaction as required by 1, we need an inherent that creates the reward. But in order for it to reflect the correct tip amount which is only known after the other transactions are applied, the inherent must go at the end of the block.

My Request: Please don't bake the assumption that inherents must be first into the Substrate protocol level (I don't care if you do it in FRAME though).

@ggwpez
Copy link
Member Author

ggwpez commented Jul 26, 2023

My Request: Please don't bake the assumption that inherents must be first into the Substrate protocol level (I don't care if you do it in FRAME though).

Yes I see... The proposed design is quite strict with its assumptions. I will try the alternative approach with the transaction ordering that is mentioned in the "Future Directions" section.

@kianenigma
Copy link
Contributor

kianenigma commented Jul 26, 2023

This is not a new assumption. The block building procedure that the Substrate client follows always puts inherents first and has always done so.

The default substrate block_builder is also an implementation and not the specification.

As it stands now, both sides of a contract (block_builder and FRAME) adhere to an arbitrary property (inherents come first), but the contract in between them does not enforce this.

I don't necessarily want to disagree with this new API, but I want us to be clear about what we are doing:

  • Up until now, it would have been possible to make inherents not come first with some tweak to block builder and FRAME.
  • Going forward, this becomes increasingly harder, because it is now more baked into the contract between the client and runtime.

@tomaka
Copy link
Contributor

tomaka commented Jul 27, 2023

The default substrate block_builder is also an implementation and not the specification.

The fact that inherents always come first is very much the specification. It is how it's designed. It's part of the protocol between the runtime and the client.

Even if it was not the specification, as I've said earlier there's technically no way for the client to know when to inject the inherents if it's not first. The ordering of transactions follows strict rules, and the ordering of inherents is the same as what the runtime returns, but the ordering between inherents and transactions would be completely undefined if it wasn't for "inherents always come first".

If you want inherents to not come first, you have to design a new feature. You have to modify the specification.
Again, it's simply not technically possible without designing a new feature, even if you wanted.

I can understand the argument that you want to leave the room later for the possibility to modify the specification so that inherents don't come first, and it is a valid argument against this RFC, but it is wrong to say that inherents do not necessarily come first now.

@pepyakin
Copy link

I think the concept of inherents is limited. Right now we treat inherents as transactions, i.e. each inherent triggers a state transition, but I would argue that they are not transactions, or rather treating them as such is limiting. What they actually are trying to achieve is passing some data from the block builder into the runtime, and as we learned the hard way it's very useful to separate the data submission and data handling. The main motiviation to create the original issue (paritytech/polkadot-sdk#312) was that some of that data is not available in on_initialize. As the solution to this we thought about adding on_after_inherents, which kind of solves some issues, but doesn't solve the others. There was an idea to submit all the inherent data, we would not check/process the data and instead just stash that data somewhere, and finally in on_after_inherents we would finally unstash it and process.

But, the question is, why don't we treat inherents as what it is: just some data that block producer included in the block body?

Here is a straw-man proposal. Imagine there is a special place in a block that is dedicated for inherent data. This data is represented as a (btree)map between the inherent ID and the inherent data. The runtime interface gets a function get_inherent_data(inherent_id) → Option<Vec<u8>>. It's behavior is defined as follows:

  1. During normal block execution, this function should be able to read the inherents from the block directly.
  2. During candidate validation (in polkadot PVF), cumulus would read this data from the block similar to the normal block execution.
  3. During block building, upon the first call this function would look up the sp_inherents::InherentDataProvider for the specified ID and get the corresponding data. If it's not the first call, then the data generated on the first call is returned.

Now, what does this construction give us?

Note that the generation of inherent data now becomes lazy if needed but it doesn't have to be that way and the inherent data could be fixed before the block creation and just returned as is in from the InherentDataProvider. Because the data is not fixed, this allow us to provide the inherent data at the end of the block.

Another consequence, is that there is no such concept as a mandatory/optional inherents. If the inherent data returned is None but the runtime is not content with that, it can immediatelly signal that by e.g. panicking.

If sp_inherents::InherentDataProvider was changed in a way that it can inspect the storage of the system, then it would make it possible to support Tuxedo (as described by @JoshOrndorff here). Specifically, the inherent data provider would access the UTXO module that would provide the difference. Although, an alternative implementation is possible where tips are tallied up through events, this is just a demonstration of flexibility of this approach.

We could extend this proposal even more. The straw-man proposal above uses the static InherentId just because it is not too disimilar to what we have right now. However, it's easy to imagine how this mechanism could benefit from dynamic requests. To give you motivation, right now when we produce a parachain block, the collator supplies data from the relay-chain as inherent. Specifically, that data includes some storage proofs from the relay-chain required for channels to work. This is now implemented by tightly coupling the block production with the parachain-system pallet. I.e. block production knows what storage parachain-system will want to read and reads all the required data from the relay-chain along with the merkle proofs. But turns out people actually want to read the data from the relay-chain or potentially any other parachainchain (e.g. paritytech/polkadot-sdk#82). So if we had two functions provided as inherents: 1. read storage from another chain 2. provide the multiproof of the read storage then we could support this use-case.

I must note however that the tight coupling between inherents checking and block building is desired for efficiency. In the above example, when we fetch the relay-chain storage proofs it won't be too good of an idea to wait until the runtime requests a key that we already know will be requested so we could as well request it in advance. However, note that this inherent mechanism doesn't preclude that: we could still initiate the requests for the storage items we know will be read in advance and then block in the InherentDataProvider only if the data didn't come in time.

Paging in @xlc @bkchr


The only drawback is that the block execution logic becomes more complicated. There is more room for
error.
Downstream developers will also need to adapt their code to this breaking change.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a good definition of breaking changes? This seems to be opt-in and doesn't break node-side code unless the user's runtime is updated.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a 5 line copy&paste diff for the definition of the runtime APIs. It is mentioned in the Integration section here paritytech/polkadot-sdk#1781
Otherwise there should not be anything needed.

for MBMs can happen entirely on the runtime side with the provided primitives.

All block authors MUST respect the `ExtrinsicInclusionMode` that is returned by `initialize_block`.
They MUST always invoke `after_inherents` directly after inherent application. The runtime MAY
Copy link
Contributor

@rphmeier rphmeier Jul 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what parameters does after_inherents take? Is there any way to limit the amount of weight this can spend?

The use-case I'm thinking of is consensus algorithms where the amount of time which can be spent in authoring is variable, e.g. asynchronous backing scenarios where some blocks might be authored in a full 2 seconds and others only in 1 second, due to variable time constraints following from the current length of the backlog.

Currently, block builders have the ability to gauge how heavy transactions are, and to stop authoring after submitting enough transactions.

Block builders should be able to limit the amount of time spent in after_inherents, although probably not arbitrarily.

Perhaps the ExtrinsicInclusionMode should return a minimum amount of weight that must be allotted to the rest of the block, so block builders can decide whether to simply give up authoring or continue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To some extent I'm wondering if it makes sense to flip block-building inside out - in theory we could have a single fn build_block(InherentData, some_total_weight_or_time_limit) and supply new transactions through special host functions instead, so the runtime actually decides whether to pull transactions from the pool, etc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weight is a runtime concept. How much weight you want to spent is also something that the runtime should decide and not the node. All information that you want to pass between initialize and after inherents, can be stored temporarily in the state and doesn't need to be passed to the function.

Regarding driving the block production from inside the runtime. I think there exist an issue or we at least discussed it at some point. The main problem being a panic in a transaction. We can not unwind the panic. We would may need to spawn a new was instance to ensure block production isn't affected by a panic.

Copy link
Contributor

@rphmeier rphmeier Jul 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How much weight you want to spent is also something that the runtime should decide and not the node.

I disagree - there's no good way for the runtime to know how much time is available for authoring in the general case.

As far as I'm aware, MBMs are oriented around doing large migrations in small chunks, especially where execution time is constrained. Having some hint for after_inherents about how much time is available would be useful for that. Though it may need to be an inherent itself

@ggwpez ggwpez changed the title Prepare BlockBuilder and Core runtime API for MBMs [RETRACTED] Prepare BlockBuilder and Core runtime API for MBMs Aug 7, 2023
Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
@ggwpez ggwpez requested review from bkchr and removed request for bkchr October 31, 2023 14:52
Copy link
Contributor

@bkchr bkchr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me

ggwpez and others added 5 commits November 14, 2023 13:59
Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
@ggwpez ggwpez changed the title Prepare BlockBuilder and Core runtime API for MBMs Prepare Core runtime API for MBMs Feb 6, 2024
@ggwpez
Copy link
Member Author

ggwpez commented Feb 6, 2024

The RFC has been slimmed down and the controversial change to BlockBuilder removed by finding an alternative.

@kianenigma
Copy link
Contributor

/rfc propose

@paritytech-rfc-bot
Copy link
Contributor

Hey @kianenigma, here is a link you can use to create the referendum aiming to approve this RFC number 0013.

Instructions
  1. Open the link.

  2. Switch to the Submission tab.

  1. Adjust the transaction if needed (for example, the proposal Origin).

  2. Submit the Transaction


It is based on commit hash d4329c544de0ef635c2c51ce2421e16e18e22858.

The proposed remark text is: RFC_APPROVE(0013,3fd88e1343c974f5fb93275aa75691b05a9346e1a535f7c6ce68ffbe37b5cf96).

Co-authored-by: Kian Paimani <5588131+kianenigma@users.noreply.github.com>
@ggwpez
Copy link
Member Author

ggwpez commented Feb 16, 2024

/rfc propose

@paritytech-rfc-bot
Copy link
Contributor

Hey @ggwpez, here is a link you can use to create the referendum aiming to approve this RFC number 0013.

Instructions
  1. Open the link.

  2. Switch to the Submission tab.

  1. Adjust the transaction if needed (for example, the proposal Origin).

  2. Submit the Transaction


It is based on commit hash e2769ade76f2622ccd9d46ff6c59938f51677447.

The proposed remark text is: RFC_APPROVE(0013,39d6e79c7a0ad99b5fd731711beb51d0b3186dbaeb89d05d6c908d0575ffcb49).

@kianenigma
Copy link
Contributor

submitted: https://collectives.subsquare.io/fellowship/referenda/79

Copy link

Voting for this referenda is ongoing.

Vote for it here

github-merge-queue bot pushed a commit to paritytech/polkadot-sdk that referenced this pull request Feb 28, 2024
This MR is the merge of
paritytech/substrate#14414 and
paritytech/substrate#14275. It implements
[RFC#13](polkadot-fellows/RFCs#13), closes
#198.

----- 

This Merge request introduces three major topicals:

1. Multi-Block-Migrations
1. New pallet `poll` hook for periodic service work
1. Replacement hooks for `on_initialize` and `on_finalize` in cases
where `poll` cannot be used

and some more general changes to FRAME.  
The changes for each topical span over multiple crates. They are listed
in topical order below.

# 1.) Multi-Block-Migrations

Multi-Block-Migrations are facilitated by creating `pallet_migrations`
and configuring `System::Config::MultiBlockMigrator` to point to it.
Executive picks this up and triggers one step of the migrations pallet
per block.
The chain is in lockdown mode for as long as an MBM is ongoing.
Executive does this by polling `MultiBlockMigrator::ongoing` and not
allowing any transaction in a block, if true.

A MBM is defined through trait `SteppedMigration`. A condensed version
looks like this:
```rust
/// A migration that can proceed in multiple steps.
pub trait SteppedMigration {
	type Cursor: FullCodec + MaxEncodedLen;
	type Identifier: FullCodec + MaxEncodedLen;

	fn id() -> Self::Identifier;

	fn max_steps() -> Option<u32>;

	fn step(
		cursor: Option<Self::Cursor>,
		meter: &mut WeightMeter,
	) -> Result<Option<Self::Cursor>, SteppedMigrationError>;
}
```

`pallet_migrations` can be configured with an aggregated tuple of these
migrations. It then starts to migrate them one-by-one on the next
runtime upgrade.
Two things are important here:
- 1. Doing another runtime upgrade while MBMs are ongoing is not a good
idea and can lead to messed up state.
- 2. **Pallet Migrations MUST BE CONFIGURED IN `System::Config`,
otherwise it is not used.**

The pallet supports an `UpgradeStatusHandler` that can be used to notify
external logic of upgrade start/finish (for example to pause XCM
dispatch).

Error recovery is very limited in the case that a migration errors or
times out (exceeds its `max_steps`). Currently the runtime dev can
decide in `FailedMigrationHandler::failed` how to handle this. One
follow-up would be to pair this with the `SafeMode` pallet and enact
safe mode when an upgrade fails, to allow governance to rescue the
chain. This is currently not possible, since governance is not
`Mandatory`.

## Runtime API

- `Core`: `initialize_block` now returns `ExtrinsicInclusionMode` to
inform the Block Author whether they can push transactions.

### Integration

Add it to your runtime implementation of `Core` and `BlockBuilder`:
```patch
diff --git a/runtime/src/lib.rs b/runtime/src/lib.rs
@@ impl_runtime_apis! {
	impl sp_block_builder::Core<Block> for Runtime {
-		fn initialize_block(header: &<Block as BlockT>::Header) {
+		fn initialize_block(header: &<Block as BlockT>::Header) -> RuntimeExecutiveMode {
			Executive::initialize_block(header)
		}

		...
	}
```

# 2.) `poll` hook

A new pallet hook is introduced: `poll`. `Poll` is intended to replace
mostly all usage of `on_initialize`.
The reason for this is that any code that can be called from
`on_initialize` cannot be migrated through an MBM. Currently there is no
way to statically check this; the implication is to use `on_initialize`
as rarely as possible.
Failing to do so can result in broken storage invariants.

The implementation of the poll hook depends on the `Runtime API` changes
that are explained above.

# 3.) Hard-Deadline callbacks

Three new callbacks are introduced and configured on `System::Config`:
`PreInherents`, `PostInherents` and `PostTransactions`.
These hooks are meant as replacement for `on_initialize` and
`on_finalize` in cases where the code that runs cannot be moved to
`poll`.
The reason for this is to make the usage of HD-code (hard deadline) more
explicit - again to prevent broken invariants by MBMs.

# 4.) FRAME (general changes)

## `frame_system` pallet

A new memorize storage item `InherentsApplied` is added. It is used by
executive to track whether inherents have already been applied.
Executive and can then execute the MBMs directly between inherents and
transactions.

The `Config` gets five new items:
- `SingleBlockMigrations` this is the new way of configuring migrations
that run in a single block. Previously they were defined as last generic
argument of `Executive`. This shift is brings all central configuration
about migrations closer into view of the developer (migrations that are
configured in `Executive` will still work for now but is deprecated).
- `MultiBlockMigrator` this can be configured to an engine that drives
MBMs. One example would be the `pallet_migrations`. Note that this is
only the engine; the exact MBMs are injected into the engine.
- `PreInherents` a callback that executes after `on_initialize` but
before inherents.
- `PostInherents` a callback that executes after all inherents ran
(including MBMs and `poll`).
- `PostTransactions` in symmetry to `PreInherents`, this one is called
before `on_finalize` but after all transactions.

A sane default is to set all of these to `()`. Example diff suitable for
any chain:
```patch
@@ impl frame_system::Config for Test {
 	type MaxConsumers = ConstU32<16>;
+	type SingleBlockMigrations = ();
+	type MultiBlockMigrator = ();
+	type PreInherents = ();
+	type PostInherents = ();
+	type PostTransactions = ();
 }
```

An overview of how the block execution now looks like is here. The same
graph is also in the rust doc.

<details><summary>Block Execution Flow</summary>
<p>

![Screenshot 2023-12-04 at 19 11
29](https://github.com/paritytech/polkadot-sdk/assets/10380170/e88a80c4-ef11-4faa-8df5-8b33a724c054)

</p>
</details> 

## Inherent Order

Moved to #2154

---------------


## TODO

- [ ] Check that `try-runtime` still works
- [ ] Ensure backwards compatibility with old Runtime APIs
- [x] Consume weight correctly
- [x] Cleanup

---------

Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
Co-authored-by: Liam Aharon <liam.aharon@hotmail.com>
Co-authored-by: Juan Girini <juangirini@gmail.com>
Co-authored-by: command-bot <>
Co-authored-by: Francisco Aguirre <franciscoaguirreperez@gmail.com>
Co-authored-by: Gavin Wood <gavin@parity.io>
Co-authored-by: Bastian Köcher <git@kchr.de>
github-merge-queue bot pushed a commit to paritytech/polkadot-sdk that referenced this pull request Feb 28, 2024
This MR is the merge of
paritytech/substrate#14414 and
paritytech/substrate#14275. It implements
[RFC#13](polkadot-fellows/RFCs#13), closes
#198.

----- 

This Merge request introduces three major topicals:

1. Multi-Block-Migrations
1. New pallet `poll` hook for periodic service work
1. Replacement hooks for `on_initialize` and `on_finalize` in cases
where `poll` cannot be used

and some more general changes to FRAME.  
The changes for each topical span over multiple crates. They are listed
in topical order below.

# 1.) Multi-Block-Migrations

Multi-Block-Migrations are facilitated by creating `pallet_migrations`
and configuring `System::Config::MultiBlockMigrator` to point to it.
Executive picks this up and triggers one step of the migrations pallet
per block.
The chain is in lockdown mode for as long as an MBM is ongoing.
Executive does this by polling `MultiBlockMigrator::ongoing` and not
allowing any transaction in a block, if true.

A MBM is defined through trait `SteppedMigration`. A condensed version
looks like this:
```rust
/// A migration that can proceed in multiple steps.
pub trait SteppedMigration {
	type Cursor: FullCodec + MaxEncodedLen;
	type Identifier: FullCodec + MaxEncodedLen;

	fn id() -> Self::Identifier;

	fn max_steps() -> Option<u32>;

	fn step(
		cursor: Option<Self::Cursor>,
		meter: &mut WeightMeter,
	) -> Result<Option<Self::Cursor>, SteppedMigrationError>;
}
```

`pallet_migrations` can be configured with an aggregated tuple of these
migrations. It then starts to migrate them one-by-one on the next
runtime upgrade.
Two things are important here:
- 1. Doing another runtime upgrade while MBMs are ongoing is not a good
idea and can lead to messed up state.
- 2. **Pallet Migrations MUST BE CONFIGURED IN `System::Config`,
otherwise it is not used.**

The pallet supports an `UpgradeStatusHandler` that can be used to notify
external logic of upgrade start/finish (for example to pause XCM
dispatch).

Error recovery is very limited in the case that a migration errors or
times out (exceeds its `max_steps`). Currently the runtime dev can
decide in `FailedMigrationHandler::failed` how to handle this. One
follow-up would be to pair this with the `SafeMode` pallet and enact
safe mode when an upgrade fails, to allow governance to rescue the
chain. This is currently not possible, since governance is not
`Mandatory`.

## Runtime API

- `Core`: `initialize_block` now returns `ExtrinsicInclusionMode` to
inform the Block Author whether they can push transactions.

### Integration

Add it to your runtime implementation of `Core` and `BlockBuilder`:
```patch
diff --git a/runtime/src/lib.rs b/runtime/src/lib.rs
@@ impl_runtime_apis! {
	impl sp_block_builder::Core<Block> for Runtime {
-		fn initialize_block(header: &<Block as BlockT>::Header) {
+		fn initialize_block(header: &<Block as BlockT>::Header) -> RuntimeExecutiveMode {
			Executive::initialize_block(header)
		}

		...
	}
```

# 2.) `poll` hook

A new pallet hook is introduced: `poll`. `Poll` is intended to replace
mostly all usage of `on_initialize`.
The reason for this is that any code that can be called from
`on_initialize` cannot be migrated through an MBM. Currently there is no
way to statically check this; the implication is to use `on_initialize`
as rarely as possible.
Failing to do so can result in broken storage invariants.

The implementation of the poll hook depends on the `Runtime API` changes
that are explained above.

# 3.) Hard-Deadline callbacks

Three new callbacks are introduced and configured on `System::Config`:
`PreInherents`, `PostInherents` and `PostTransactions`.
These hooks are meant as replacement for `on_initialize` and
`on_finalize` in cases where the code that runs cannot be moved to
`poll`.
The reason for this is to make the usage of HD-code (hard deadline) more
explicit - again to prevent broken invariants by MBMs.

# 4.) FRAME (general changes)

## `frame_system` pallet

A new memorize storage item `InherentsApplied` is added. It is used by
executive to track whether inherents have already been applied.
Executive and can then execute the MBMs directly between inherents and
transactions.

The `Config` gets five new items:
- `SingleBlockMigrations` this is the new way of configuring migrations
that run in a single block. Previously they were defined as last generic
argument of `Executive`. This shift is brings all central configuration
about migrations closer into view of the developer (migrations that are
configured in `Executive` will still work for now but is deprecated).
- `MultiBlockMigrator` this can be configured to an engine that drives
MBMs. One example would be the `pallet_migrations`. Note that this is
only the engine; the exact MBMs are injected into the engine.
- `PreInherents` a callback that executes after `on_initialize` but
before inherents.
- `PostInherents` a callback that executes after all inherents ran
(including MBMs and `poll`).
- `PostTransactions` in symmetry to `PreInherents`, this one is called
before `on_finalize` but after all transactions.

A sane default is to set all of these to `()`. Example diff suitable for
any chain:
```patch
@@ impl frame_system::Config for Test {
 	type MaxConsumers = ConstU32<16>;
+	type SingleBlockMigrations = ();
+	type MultiBlockMigrator = ();
+	type PreInherents = ();
+	type PostInherents = ();
+	type PostTransactions = ();
 }
```

An overview of how the block execution now looks like is here. The same
graph is also in the rust doc.

<details><summary>Block Execution Flow</summary>
<p>

![Screenshot 2023-12-04 at 19 11
29](https://github.com/paritytech/polkadot-sdk/assets/10380170/e88a80c4-ef11-4faa-8df5-8b33a724c054)

</p>
</details> 

## Inherent Order

Moved to #2154

---------------


## TODO

- [ ] Check that `try-runtime` still works
- [ ] Ensure backwards compatibility with old Runtime APIs
- [x] Consume weight correctly
- [x] Cleanup

---------

Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
Co-authored-by: Liam Aharon <liam.aharon@hotmail.com>
Co-authored-by: Juan Girini <juangirini@gmail.com>
Co-authored-by: command-bot <>
Co-authored-by: Francisco Aguirre <franciscoaguirreperez@gmail.com>
Co-authored-by: Gavin Wood <gavin@parity.io>
Co-authored-by: Bastian Köcher <git@kchr.de>
@ggwpez
Copy link
Member Author

ggwpez commented Mar 17, 2024

/rfc process 0xbb5a58e7aef7153e78d37d7248c7114739e0da19040356045d1b56de86813cc2

@paritytech-rfc-bot paritytech-rfc-bot bot merged commit 049a035 into main Mar 17, 2024
@paritytech-rfc-bot
Copy link
Contributor

The on-chain referendum has approved the RFC.

@ggwpez ggwpez deleted the oty-rfc-13 branch March 17, 2024 12:18
bgallois pushed a commit to duniter/duniter-polkadot-sdk that referenced this pull request Mar 25, 2024
…ech#1781)

This MR is the merge of
paritytech/substrate#14414 and
paritytech/substrate#14275. It implements
[RFC#13](polkadot-fellows/RFCs#13), closes
paritytech#198.

----- 

This Merge request introduces three major topicals:

1. Multi-Block-Migrations
1. New pallet `poll` hook for periodic service work
1. Replacement hooks for `on_initialize` and `on_finalize` in cases
where `poll` cannot be used

and some more general changes to FRAME.  
The changes for each topical span over multiple crates. They are listed
in topical order below.

# 1.) Multi-Block-Migrations

Multi-Block-Migrations are facilitated by creating `pallet_migrations`
and configuring `System::Config::MultiBlockMigrator` to point to it.
Executive picks this up and triggers one step of the migrations pallet
per block.
The chain is in lockdown mode for as long as an MBM is ongoing.
Executive does this by polling `MultiBlockMigrator::ongoing` and not
allowing any transaction in a block, if true.

A MBM is defined through trait `SteppedMigration`. A condensed version
looks like this:
```rust
/// A migration that can proceed in multiple steps.
pub trait SteppedMigration {
	type Cursor: FullCodec + MaxEncodedLen;
	type Identifier: FullCodec + MaxEncodedLen;

	fn id() -> Self::Identifier;

	fn max_steps() -> Option<u32>;

	fn step(
		cursor: Option<Self::Cursor>,
		meter: &mut WeightMeter,
	) -> Result<Option<Self::Cursor>, SteppedMigrationError>;
}
```

`pallet_migrations` can be configured with an aggregated tuple of these
migrations. It then starts to migrate them one-by-one on the next
runtime upgrade.
Two things are important here:
- 1. Doing another runtime upgrade while MBMs are ongoing is not a good
idea and can lead to messed up state.
- 2. **Pallet Migrations MUST BE CONFIGURED IN `System::Config`,
otherwise it is not used.**

The pallet supports an `UpgradeStatusHandler` that can be used to notify
external logic of upgrade start/finish (for example to pause XCM
dispatch).

Error recovery is very limited in the case that a migration errors or
times out (exceeds its `max_steps`). Currently the runtime dev can
decide in `FailedMigrationHandler::failed` how to handle this. One
follow-up would be to pair this with the `SafeMode` pallet and enact
safe mode when an upgrade fails, to allow governance to rescue the
chain. This is currently not possible, since governance is not
`Mandatory`.

## Runtime API

- `Core`: `initialize_block` now returns `ExtrinsicInclusionMode` to
inform the Block Author whether they can push transactions.

### Integration

Add it to your runtime implementation of `Core` and `BlockBuilder`:
```patch
diff --git a/runtime/src/lib.rs b/runtime/src/lib.rs
@@ impl_runtime_apis! {
	impl sp_block_builder::Core<Block> for Runtime {
-		fn initialize_block(header: &<Block as BlockT>::Header) {
+		fn initialize_block(header: &<Block as BlockT>::Header) -> RuntimeExecutiveMode {
			Executive::initialize_block(header)
		}

		...
	}
```

# 2.) `poll` hook

A new pallet hook is introduced: `poll`. `Poll` is intended to replace
mostly all usage of `on_initialize`.
The reason for this is that any code that can be called from
`on_initialize` cannot be migrated through an MBM. Currently there is no
way to statically check this; the implication is to use `on_initialize`
as rarely as possible.
Failing to do so can result in broken storage invariants.

The implementation of the poll hook depends on the `Runtime API` changes
that are explained above.

# 3.) Hard-Deadline callbacks

Three new callbacks are introduced and configured on `System::Config`:
`PreInherents`, `PostInherents` and `PostTransactions`.
These hooks are meant as replacement for `on_initialize` and
`on_finalize` in cases where the code that runs cannot be moved to
`poll`.
The reason for this is to make the usage of HD-code (hard deadline) more
explicit - again to prevent broken invariants by MBMs.

# 4.) FRAME (general changes)

## `frame_system` pallet

A new memorize storage item `InherentsApplied` is added. It is used by
executive to track whether inherents have already been applied.
Executive and can then execute the MBMs directly between inherents and
transactions.

The `Config` gets five new items:
- `SingleBlockMigrations` this is the new way of configuring migrations
that run in a single block. Previously they were defined as last generic
argument of `Executive`. This shift is brings all central configuration
about migrations closer into view of the developer (migrations that are
configured in `Executive` will still work for now but is deprecated).
- `MultiBlockMigrator` this can be configured to an engine that drives
MBMs. One example would be the `pallet_migrations`. Note that this is
only the engine; the exact MBMs are injected into the engine.
- `PreInherents` a callback that executes after `on_initialize` but
before inherents.
- `PostInherents` a callback that executes after all inherents ran
(including MBMs and `poll`).
- `PostTransactions` in symmetry to `PreInherents`, this one is called
before `on_finalize` but after all transactions.

A sane default is to set all of these to `()`. Example diff suitable for
any chain:
```patch
@@ impl frame_system::Config for Test {
 	type MaxConsumers = ConstU32<16>;
+	type SingleBlockMigrations = ();
+	type MultiBlockMigrator = ();
+	type PreInherents = ();
+	type PostInherents = ();
+	type PostTransactions = ();
 }
```

An overview of how the block execution now looks like is here. The same
graph is also in the rust doc.

<details><summary>Block Execution Flow</summary>
<p>

![Screenshot 2023-12-04 at 19 11
29](https://github.com/paritytech/polkadot-sdk/assets/10380170/e88a80c4-ef11-4faa-8df5-8b33a724c054)

</p>
</details> 

## Inherent Order

Moved to paritytech#2154

---------------


## TODO

- [ ] Check that `try-runtime` still works
- [ ] Ensure backwards compatibility with old Runtime APIs
- [x] Consume weight correctly
- [x] Cleanup

---------

Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
Co-authored-by: Liam Aharon <liam.aharon@hotmail.com>
Co-authored-by: Juan Girini <juangirini@gmail.com>
Co-authored-by: command-bot <>
Co-authored-by: Francisco Aguirre <franciscoaguirreperez@gmail.com>
Co-authored-by: Gavin Wood <gavin@parity.io>
Co-authored-by: Bastian Köcher <git@kchr.de>
@anaelleltd anaelleltd added the Implemented Is merged or live as a feature/service. label Sep 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Implemented Is merged or live as a feature/service.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants