[ElasticScaling] Enable ElasticScaling on AHKusama and AHPolkadot

## Goal

As part of our low-latency roadmap, we are rolling out elastic scaling on Asset Hub Kusama (AHK) and Asset Hub Polkadot (AHP).
This change is expected to bring ~2s block times using 3 cores.

The rollout follows a gradual strategy: test-nets (Versi, Westend, Paseo), then Kusama, then Polkadot.

This is part of:
- https://github.com/polkadot-fellows/runtimes/issues/757


## Test Nets Triaging

### Asset Hub Westend

Elastic Scaling is enabled on AssetHubWestend, using the following PR:
- https://github.com/paritytech/polkadot-sdk/pull/9880

We have discovered that Asset Hub Westend is not in ideal shape:
- https://github.com/paritytech/polkadot-sdk/issues/10227

Multiple collations were not advertised to validators, some validators were offline, and some validator records could not be found in the DHT. We believe these issues are unrelated to the elastic scaling feature. Work is in progress with the DevOps team to bring the chain to a stable state.
- https://github.com/paritytech/polkadot-sdk/issues/10282
- https://github.com/paritytech/polkadot-sdk/issues/10283
- https://github.com/paritytech/polkadot-sdk/issues/10273

The chain is running `stable2509-2`, a release that does not include our latest optimizations for collator-to-validator connection stability:
- https://github.com/paritytech/polkadot-sdk/pull/10305
- https://github.com/paritytech/polkadot-sdk/pull/10362


### Versi

We have enabled elastic scaling on a YAP parachain in Versi (our dedicated test net).
We have been stress testing the YAP parachain by periodically sending 2k transactions.
In Versi, we have observed stable 2s block times despite the transaction spamming.

However, we have observed the following behavior which was not expected in test-nets:
- Collator to validator connectivity shows periodic instability
- Collation fetch latency spikes from sub-10ms to 2000ms or even 4000+ms in various cases

Despite the connection instability, the elastic scaling feature can produce:
- stable 2s blocks for 3 cores
- stable 600ms blocks for 12 cores (with PR applied: https://github.com/paritytech/polkadot-sdk/pull/10154)
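As a rough sanity check on these figures, the sketch below divides the relay chain block time by the number of assigned cores. It assumes a 6s relay chain block time and one parachain block backed per core per relay chain block, which is a simplification. The function name `ideal_block_time_ms` is hypothetical and not part of the SDK.

```python
# Assumption: 6s relay chain block time, one parachain block per core
# per relay chain block. This is an idealized bound, not a measurement.
RELAY_BLOCK_TIME_MS = 6000


def ideal_block_time_ms(cores, relay_block_time_ms=RELAY_BLOCK_TIME_MS):
    """Idealized parachain block time when `cores` cores are assigned."""
    if cores < 1:
        raise ValueError("cores must be >= 1")
    return relay_block_time_ms / cores


for cores in (1, 3, 12):
    print(f"{cores} cores -> {ideal_block_time_ms(cores):.0f} ms")
```

Under these assumptions, 3 cores give 2000 ms, matching the observed 2s blocks, and 12 cores give 500 ms, so the observed 600ms sits somewhat above the idealized bound.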


For more details check:
- https://github.com/paritytech/polkadot-sdk/issues/10310

We identified the following optimization while debugging the chain:
- https://github.com/paritytech/polkadot-sdk/pull/10362


### Paseo

While asset hub westend issues are being resolved, we have decided to deploy a new chain in Paseo:
- https://github.com/paseo-network/passet-hub/pull/30

The chain is running a patched version of origin/master, including the following:
- https://github.com/paritytech/polkadot-sdk/pull/10305
- https://github.com/paritytech/polkadot-sdk/pull/10362
- https://github.com/paritytech/polkadot-sdk/pull/10154
- litep2p: https://github.com/paritytech/litep2p/pull/480
- litep2p: https://github.com/paritytech/litep2p/pull/478

The chain is deployed manually with 3 collators on our cloud instances.

The chain is able to sustain ~2s blocks on average (mainly between 2s-3s), with occasional spikes of 18 blocks.

<img width="1555" height="490" alt="Image" src="https://github.com/user-attachments/assets/ff77c2bf-e896-4f19-8f0f-f7ef0804375c" />


Over 24h, we see around 571 of the following warnings per collator:

```
WARN parachain::collator-protocol: [Parachain] Collation wasn't advertised to any validator.
```
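A count like this can be reproduced from a collator log with a few lines of Python. This is a minimal sketch, not the tooling used for this triage: the helper names are hypothetical, and the sample is built only from log lines quoted in this issue.

```python
WARNING_TEXT = "Collation wasn't advertised to any validator"


def count_unadvertised(lines):
    """Count log lines that contain the collator-protocol warning."""
    return sum(1 for line in lines if WARNING_TEXT in line)


def per_hour(total, hours=24.0):
    """Average number of warnings per hour over the observation window."""
    return total / hours


# Illustrative sample built from log lines quoted in this issue.
sample_log = [
    "WARN parachain::collator-protocol: [Parachain] Collation wasn't advertised to any validator.",
    "17:24:01.474 DEBUG tokio-runtime-worker aura::cumulus: [Parachain] Adjusted proposal duration. duration=Some(1.526s)",
    "WARN parachain::collator-protocol: [Parachain] Collation wasn't advertised to any validator.",
]

print(count_unadvertised(sample_log))  # 2
print(round(per_hour(571), 1))  # 23.8
```

`count_unadvertised` accepts any iterable of lines, so an open log file can be passed directly. At 571 warnings per 24h, the average is roughly 24 warnings per hour per collator.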

The authoring adjustment is working as expected:

```
17:24:01.474 DEBUG tokio-runtime-worker aura::cumulus: [Parachain] Adjusted proposal duration. duration=Some(1.526s)
17:24:03.405 DEBUG tokio-runtime-worker aura::cumulus: [Parachain] Adjusted proposal duration. duration=Some(1.595s)
17:24:05.407 DEBUG tokio-runtime-worker aura::cumulus: [Parachain] Adjusted proposal duration. duration=Some(593.137639ms)
```
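To chart these adjustments over time, the `duration=Some(...)` field can be extracted and normalized to seconds. The sketch below is an assumption-laden helper (the name `parse_proposal_duration_s` is hypothetical) that handles only the `s` and `ms` units seen in the lines above.

```python
import re

# Matches e.g. "duration=Some(1.526s)" or "duration=Some(593.137639ms)".
# Only the "s" and "ms" units that appear in the logs above are handled.
DURATION_RE = re.compile(
    r"duration=Some\((?P<value>\d+(?:\.\d+)?)(?P<unit>ms|s)\)"
)


def parse_proposal_duration_s(line):
    """Return the adjusted proposal duration in seconds, or None if absent."""
    match = DURATION_RE.search(line)
    if match is None:
        return None
    value = float(match.group("value"))
    return value / 1000.0 if match.group("unit") == "ms" else value


log_lines = [
    "17:24:01.474 DEBUG tokio-runtime-worker aura::cumulus: [Parachain] Adjusted proposal duration. duration=Some(1.526s)",
    "17:24:03.405 DEBUG tokio-runtime-worker aura::cumulus: [Parachain] Adjusted proposal duration. duration=Some(1.595s)",
    "17:24:05.407 DEBUG tokio-runtime-worker aura::cumulus: [Parachain] Adjusted proposal duration. duration=Some(593.137639ms)",
]

durations = [parse_proposal_duration_s(line) for line in log_lines]
print(durations)
```

Lines without a `duration=Some(...)` field yield `None`, so they can be filtered out before plotting.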

The connection between collators and validators is still not ideal, accompanied by occasional collation fetch latency spikes:

<img width="736" height="308" alt="Image" src="https://github.com/user-attachments/assets/42844d42-d6f2-49e9-8529-3774b3f739ee" />


### Conclusion from Test Nets

After reviewing the above, we have decided to move forward with enabling the elastic-scaling feature on Kusama.

We believe the recent connectivity issues are unrelated to the elastic scaling implementation. They appear to stem from a combination of networking conditions, validator setups, and possible race conditions that delay node connectivity.


### Kusama

The following PR prepares elastic scaling with 3 cores on AssetHubKusama:
- https://github.com/polkadot-fellows/runtimes/pull/1018


### Polkadot

Asset Hub Polkadot will follow shortly after Kusama has been triaged.


## Key Improvements and Other Findings

- https://github.com/paritytech/polkadot-sdk/pull/10154
- https://github.com/paritytech/polkadot-sdk/pull/10362
- https://github.com/paritytech/litep2p/pull/480
- https://github.com/paritytech/litep2p/pull/478
- https://github.com/paritytech/polkadot-sdk/issues/10341


