
Block concurrency issues #646

Open
Nashtare opened this issue Sep 19, 2024 · 4 comments

@Nashtare (Collaborator)

I've noticed block concurrency issues in limited environments (i.e. without unlimited horizontal scaling), which can have a non-negligible overall performance impact. It may be due to #600.

I proved a payload of 20 contiguous blocks with a t2d-60 instance running 12 simulated workers (in-memory runtime). Because of the limit on parallel block proving, a block needs to be fully proven before a new one is added to the queue. However, because each block proof relies on the previous one, the end of each block's pipeline becomes purely sequential: block 1 must be fully proven before segment proofs for block 17 can be kicked off, and so on.
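
To illustrate the pattern, here is a minimal Rust sketch (not our actual scheduler; `POOL_SIZE`, `prove_segments` and `aggregate_into_block_proof` are made-up names, and the pool size of 16 is only inferred from the block 1 → block 17 gap):

```rust
use std::sync::Arc;
use tokio::sync::{oneshot, Semaphore};

/// Hypothetical pool size, inferred from "block 1 must finish before block 17 starts".
const POOL_SIZE: usize = 16;

struct BlockProof(u64);

/// Heavy, embarrassingly parallel work: this is where the 12 workers should stay busy.
async fn prove_segments(_block: u64) {}

/// Cheap aggregation, but chained on the previous block's proof, hence sequential.
async fn aggregate_into_block_proof(block: u64, _prev: Option<BlockProof>) -> BlockProof {
    BlockProof(block)
}

async fn prove_blocks(n_blocks: u64) {
    let pool = Arc::new(Semaphore::new(POOL_SIZE));
    let mut prev_rx: Option<oneshot::Receiver<BlockProof>> = None;
    let mut handles = Vec::new();

    for block in 1..=n_blocks {
        // Admission to the pool: block `k + POOL_SIZE` waits here until block `k`
        // drops its permit, i.e. until block `k` is *fully* proven.
        let permit = pool.clone().acquire_owned().await.unwrap();

        let (tx, rx) = oneshot::channel();
        let prev = prev_rx.replace(rx);

        handles.push(tokio::spawn(async move {
            prove_segments(block).await;

            // Once the segments are done, the only remaining work is fast aggregation,
            // but the permit is still held, so no new block can enter the queue and
            // the segment workers go idle (hence the gap shown in the logs below).
            let prev_proof = match prev {
                Some(rx) => Some(rx.await.unwrap()),
                None => None,
            };
            let proof = aggregate_into_block_proof(block, prev_proof).await;
            let _ = tx.send(proof);

            drop(permit); // only released after the full block proof
        }));
    }

    for handle in handles {
        let _ = handle.await;
    }
}
```

With this shape, up to `POOL_SIZE` blocks can run segment proofs in parallel, but each permit is held through the (sequential) aggregation phase, which is where the workers starve.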

I've attached the logs below. If we grep everything related to block 1:

$ cat b1_20.log | grep 'id="b1 '
2024-09-19T13:34:57.225907Z  INFO p_gen: zero::prover_state: using on demand circuit ProverStateManager { circuit_config: CircuitConfig { circuits: [16..21, 8..21, 8..21, 4..20, 8..17, 4..21, 17..24, 16..23, 7..23] }, persistence: Disk(OnDemand) } id="b1 - 0_0 (0)"
2024-09-19T13:34:57.276424Z  INFO p_gen: evm_arithmetization::generation::state: CPU halted after 11092 cycles     id="b1 - 0_0 (0)"
2024-09-19T13:34:57.276735Z  INFO p_gen: evm_arithmetization::generation: CPU trace padded to 16384 cycles     id="b1 - 0_0 (0)"
2024-09-19T13:35:00.468043Z  INFO p_gen: evm_arithmetization::witness::traces: Trace lengths (before padding): TraceCheckpoint { arithmetic_len: 1705, byte_packing_len: 136, cpu_len: 16384, keccak_len: 696, keccak_sponge_len: 29, logic_len: 161, memory_len: 97894 }, mem_before_len: 66049, mem_after_len: 0     id="b1 - 0_0 (0)"
2024-09-19T13:36:05.206126Z  INFO p_gen: zero::prover_state: using on demand circuit ProverStateManager { circuit_config: CircuitConfig { circuits: [16..21, 8..21, 8..21, 4..20, 8..17, 4..21, 17..24, 16..23, 7..23] }, persistence: Disk(OnDemand) } id="b1 - 0_0 (0)"
2024-09-19T13:36:05.224921Z  INFO p_gen: evm_arithmetization::generation::state: CPU halted after 13466 cycles     id="b1 - 0_0 (0)"
2024-09-19T13:36:05.225135Z  INFO p_gen: evm_arithmetization::generation: CPU trace padded to 16384 cycles     id="b1 - 0_0 (0)"
2024-09-19T13:36:05.638000Z  INFO p_gen: evm_arithmetization::witness::traces: Trace lengths (before padding): TraceCheckpoint { arithmetic_len: 2051, byte_packing_len: 169, cpu_len: 16384, keccak_len: 744, keccak_sponge_len: 31, logic_len: 175, memory_len: 103964 }, mem_before_len: 66049, mem_after_len: 0     id="b1 - 0_0 (0)"
2024-09-19T13:36:25.986594Z  INFO p_gen: zero::ops: segment proof (Dummy) took 88.760709993s id="b1 - 0_0 (0)"
2024-09-19T13:36:59.691013Z  INFO p_gen: zero::ops: segment proof (Dummy) took 54.484886601s id="b1 - 0_0 (0)"

And later, when the proof for block 1 is complete and block 17 is added to the queue:

2024-09-19T13:39:34.592868Z  INFO zero::prover: Successfully proved block 1
2024-09-19T13:39:34.592938Z  INFO zero::prover: Proving block 17

There is a delta of 2 min 35 s during which only aggregation proofs were performed for this block (and those are orders of magnitude faster than segment proofs).

An easy way to deal with this is simply to increase the pool size (#644 makes it possible), but this goes against the initial purpose of the feature. Ideally, we'd want to free the queue as soon as all the segments have been computed, but this feels a bit hacky.
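
For reference, the "free the queue once all segments are computed" idea would roughly amount to moving the permit release in the sketch above (same hypothetical names); the trade-off is that more blocks can then sit in the aggregation stage at once, which is precisely what the pool size was meant to bound:

```rust
handles.push(tokio::spawn(async move {
    prove_segments(block).await;
    // Release the pool slot as soon as the heavy, parallel work is finished,
    // so the next block's segment proofs can start while we aggregate this one.
    drop(permit);

    let prev_proof = match prev {
        Some(rx) => Some(rx.await.unwrap()),
        None => None,
    };
    let proof = aggregate_into_block_proof(block, prev_proof).await;
    let _ = tx.send(proof);
}));
```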

b1_20.log

Nashtare added this to the Performance Tuning milestone on Sep 19, 2024

@atanmarko (Contributor)

We will refactor the proving concurrency logic as part of #627. Then we can rethink this limitation.

atanmarko self-assigned this on Sep 19, 2024

@Nashtare (Collaborator, Author)

I think we'll soon need some actual benchmarking criteria in CI, because the default parameters here negatively impact performance, and we'd ideally not introduce any noticeable regression in develop / main.

@Nashtare (Collaborator, Author) commented Oct 1, 2024

@atanmarko Just as a heads-up: if we can't address this prior to the next release, I'll bump the default pool size to a larger value to prevent any perf regression.

@atanmarko (Contributor)

@Nashtare I plan to start on #627 at the end of this week or next week.
