perf(levm): optimize op_shl #1841

Merged: 37 commits into main from levm/optimize-bitshifts, Feb 3, 2025

Conversation

@dsocolobsky dsocolobsky commented Jan 29, 2025

Motivation

While reviewing the ERC20 benchmarks, we found that a major bottleneck was the constant bit shifting, since checked_shift_left is quite slow. This PR adds a few precomputed values and small optimizations to speed it up.

Description

  • We keep a small table of precomputed values of 1<<amount for a few common shift amounts (units like wei, szabo, eth, etc., and addresses).
  • If the shift amount is not in the table, we check whether the result is a power of two (1<<amount) and, if so, compute it with a safe pow() instead of calling checked_shift_left.
  • Otherwise, the shift runs as it did before.
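
The three cases above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: `Word` is a hypothetical four-limb stand-in for `U256`, `pow2` and `shl_generic` are made-up helpers, and of the table keys only 160 is mentioned in the description (96 and 248 are placeholders).

```rust
use std::collections::HashMap;
use std::sync::LazyLock;

/// Hypothetical stand-in for a 256-bit word: four little-endian 64-bit limbs.
type Word = [u64; 4];

const ZERO: Word = [0; 4];
const ONE: Word = [1, 0, 0, 0];

/// Returns 2^exp by setting a single bit; exp is a u8, so it is always < 256.
fn pow2(exp: u8) -> Word {
    let mut w = ZERO;
    w[(exp / 64) as usize] = 1u64 << (exp % 64);
    w
}

/// Precomputed 1 << amount values. Only 160 comes from the PR description;
/// the other keys are illustrative assumptions.
static SHL_PRECALC: LazyLock<HashMap<u8, Word>> =
    LazyLock::new(|| [96u8, 160, 248].iter().map(|&s| (s, pow2(s))).collect());

/// Slow path: a general limb-by-limb left shift.
fn shl_generic(value: Word, shift: u32) -> Word {
    if shift >= 256 {
        return ZERO;
    }
    let limb_shift = (shift / 64) as usize;
    let bit_shift = shift % 64;
    let mut out = ZERO;
    for i in limb_shift..4 {
        let src = i - limb_shift;
        out[i] = value[src] << bit_shift;
        // Carry in the bits shifted out of the next lower limb.
        if bit_shift > 0 && src > 0 {
            out[i] |= value[src - 1] >> (64 - bit_shift);
        }
    }
    out
}

/// SHL with the fast paths tried first: table hit, then power of two, then generic.
fn shl(shift: Word, value: Word) -> Word {
    // Shifting by 256 or more always yields zero.
    if shift[1] != 0 || shift[2] != 0 || shift[3] != 0 || shift[0] >= 256 {
        return ZERO;
    }
    let amount = shift[0] as u8;
    if value == ONE {
        if let Some(hit) = SHL_PRECALC.get(&amount) {
            return *hit;
        }
        return pow2(amount);
    }
    shl_generic(value, u32::from(amount))
}
```

In this toy version the table and `pow2` cost about the same, so the table only mirrors the structure described above; the benefit reported in the PR comes from avoiding `checked_shift_left` on the real `U256` type.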

With this, the gap with revm in the ERC20 benchmarks drops from 10-12x to only ~2x, which is more in line with the other benchmarks. The obvious catch is that much of the gain comes from precomputing 1<<160, but that value is commonly used to compute addresses in Ethereum contracts, so it should be useful imho.
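
To illustrate why 1<<160 shows up so often: contracts typically clean an address by masking a word with (1 << 160) - 1. The sketch below reproduces that arithmetic. It is an assumption-laden illustration, not code from this PR: `Word`, `pow2`, `sub_one`, `bit_and` and `address_mask` are hypothetical helpers, and the claim that compiled Solidity builds the mask with a `PUSH1 1, PUSH1 0xa0, SHL, SUB` sequence is my recollection of typical compiler output rather than something stated here.

```rust
/// Hypothetical stand-in for a 256-bit word: four little-endian 64-bit limbs.
type Word = [u64; 4];

/// Returns 2^exp by setting a single bit; exp is a u8, so it is always < 256.
fn pow2(exp: u8) -> Word {
    let mut w = [0u64; 4];
    w[(exp / 64) as usize] = 1u64 << (exp % 64);
    w
}

/// Subtracts one, propagating the borrow across limbs (wraps modulo 2^256).
fn sub_one(mut w: Word) -> Word {
    for limb in w.iter_mut() {
        let (value, borrow) = limb.overflowing_sub(1);
        *limb = value;
        if !borrow {
            break;
        }
    }
    w
}

/// Limb-wise bitwise AND.
fn bit_and(a: Word, b: Word) -> Word {
    [a[0] & b[0], a[1] & b[1], a[2] & b[2], a[3] & b[3]]
}

/// (1 << 160) - 1: keeps the low 160 bits, i.e. a 20-byte address.
fn address_mask() -> Word {
    sub_one(pow2(160))
}
```

Every such mask starts with a shift of 1 by 160, which is why a single table entry can cover a large share of the SHL calls in an ERC20 workload.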

I'm open to discussion, since this does add a bit of complexity and is a tradeoff, but the code is pretty simple.

dsocolobsky and others added 30 commits January 21, 2025 17:44
with hardcoded bytecode for now
Using tx_report.output instead of current_call_frame.output
Now we run more preparation code for the VM; without this change, contracts that call themselves via STATICCALL or similar were not executed correctly.

github-actions bot commented Jan 29, 2025

| File                                                                                     | Lines | Diff |
+------------------------------------------------------------------------------------------+-------+------+
| /home/runner/work/ethrex/ethrex/crates/vm/levm/src/opcode_handlers/bitwise_comparison.rs | 279   | +47  |
+------------------------------------------------------------------------------------------+-------+------+

Total lines added: +47
Total lines removed: 0
Total lines changed: 47

github-actions bot commented Jan 29, 2025

Benchmark Results Comparison

PR Results

Benchmark Results: Factorial

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| revm_Factorial | 245.4 ± 3.5 | 241.2 | 251.4 | 1.00 |
| levm_Factorial | 903.4 ± 10.5 | 889.2 | 920.7 | 3.68 ± 0.07 |

Benchmark Results: Factorial - Recursive

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| revm_FactorialRecursive | 1.451 ± 0.089 | 1.338 | 1.598 | 1.00 |
| levm_FactorialRecursive | 15.681 ± 0.033 | 15.634 | 15.736 | 10.81 ± 0.67 |

Benchmark Results: Fibonacci

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| revm_Fibonacci | 211.1 ± 0.6 | 210.4 | 212.2 | 1.00 |
| levm_Fibonacci | 898.3 ± 13.9 | 878.7 | 922.6 | 4.26 ± 0.07 |

Benchmark Results: ManyHashes

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| revm_ManyHashes | 8.7 ± 0.1 | 8.6 | 9.0 | 1.00 |
| levm_ManyHashes | 18.3 ± 0.1 | 18.1 | 18.6 | 2.10 ± 0.03 |

Benchmark Results: BubbleSort

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| revm_BubbleSort | 3.196 ± 0.015 | 3.179 | 3.230 | 1.00 |
| levm_BubbleSort | 6.101 ± 0.034 | 6.070 | 6.170 | 1.91 ± 0.01 |

Benchmark Results: ERC20 - Transfer

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| revm_ERC20Transfer | 248.5 ± 1.4 | 246.8 | 251.5 | 1.00 |
| levm_ERC20Transfer | 537.7 ± 5.0 | 531.0 | 547.4 | 2.16 ± 0.02 |

Benchmark Results: ERC20 - Mint

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| revm_ERC20Mint | 142.4 ± 1.4 | 140.9 | 144.8 | 1.00 |
| levm_ERC20Mint | 348.9 ± 3.4 | 345.2 | 356.7 | 2.45 ± 0.03 |

Benchmark Results: ERC20 - Approval

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| revm_ERC20Approval | 1.043 ± 0.008 | 1.035 | 1.055 | 1.00 |
| levm_ERC20Approval | 2.028 ± 0.025 | 1.998 | 2.079 | 1.94 ± 0.03 |

Main Results

Benchmark Results: Factorial

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| revm_Factorial | 235.1 ± 0.8 | 233.3 | 236.6 | 1.00 |
| levm_Factorial | 913.1 ± 12.0 | 894.4 | 935.9 | 3.88 ± 0.05 |

Benchmark Results: Factorial - Recursive

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| revm_FactorialRecursive | 1.576 ± 0.092 | 1.431 | 1.674 | 1.00 |
| levm_FactorialRecursive | 15.610 ± 0.033 | 15.554 | 15.663 | 9.91 ± 0.58 |

Benchmark Results: Fibonacci

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| revm_Fibonacci | 208.1 ± 0.6 | 207.6 | 209.5 | 1.00 |
| levm_Fibonacci | 903.1 ± 20.5 | 886.7 | 954.3 | 4.34 ± 0.10 |

Benchmark Results: ManyHashes

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| revm_ManyHashes | 8.7 ± 0.1 | 8.7 | 8.9 | 1.00 |
| levm_ManyHashes | 18.4 ± 0.1 | 18.2 | 18.5 | 2.10 ± 0.02 |

Benchmark Results: BubbleSort

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| revm_BubbleSort | 3.219 ± 0.018 | 3.196 | 3.249 | 1.00 |
| levm_BubbleSort | 6.065 ± 0.041 | 6.026 | 6.133 | 1.88 ± 0.02 |

Benchmark Results: ERC20 - Transfer

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| revm_ERC20Transfer | 251.7 ± 1.9 | 248.5 | 253.9 | 1.00 |
| levm_ERC20Transfer | 3224.3 ± 28.1 | 3180.9 | 3253.3 | 12.81 ± 0.15 |

Benchmark Results: ERC20 - Mint

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| revm_ERC20Mint | 141.8 ± 0.5 | 140.8 | 142.5 | 1.00 |
| levm_ERC20Mint | 1689.8 ± 12.1 | 1678.8 | 1709.8 | 11.92 ± 0.10 |

Benchmark Results: ERC20 - Approval

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| revm_ERC20Approval | 1.054 ± 0.011 | 1.040 | 1.073 | 1.00 |
| levm_ERC20Approval | 11.338 ± 0.086 | 11.253 | 11.498 | 10.76 ± 0.14 |

@dsocolobsky dsocolobsky changed the title perf(levm): Optimize op_shl perf(levm): optimize op_shl Jan 29, 2025
@dsocolobsky dsocolobsky force-pushed the levm/optimize-bitshifts branch from cadc789 to 1d5b416 Compare January 30, 2025 14:01
@dsocolobsky dsocolobsky marked this pull request as ready for review January 30, 2025 14:23
@dsocolobsky dsocolobsky requested a review from a team as a code owner January 30, 2025 14:23
@fborello-lambda fborello-lambda left a comment

Nice, seems to be a good and efficient solution (based on the benchmarks, we have a considerable improvement). We may have to test it more thoroughly.

@ilitteri ilitteri left a comment

LGTM. Fix the CI and we can merge it

@dsocolobsky dsocolobsky force-pushed the levm/optimize-bitshifts branch from 99b1389 to dfadd07 Compare February 3, 2025 14:31

// Comparison and Bitwise Logic Operations (14)
// Opcodes: LT, GT, SLT, SGT, EQ, ISZERO, AND, OR, XOR, NOT, BYTE, SHL, SHR, SAR

static SHL_PRECALC: LazyLock<HashMap<u8, U256>> = LazyLock::new(|| {
Contributor

Why does this need to be static? Can't it be const?

Contributor Author

AFAIK a HashMap needs to be static (see for instance https://users.rust-lang.org/t/is-there-a-way-to-create-a-constant-map-in-rust/8358), but if you have an idea I can try it.

Contributor

Would it be too hard to test a match instead of a HashMap? We can still return Some/None depending on the input. Since we're benchmarking, I'd like to see if we can squeeze out a bit more performance. But this is just a nit; if it is too much trouble, don't bother.
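
A match-based lookup along the lines suggested here might look like the sketch below. It is hypothetical: `Word` is a four-limb stand-in for `U256`, `shl_precalc` is an invented name, and only the 160 entry is mentioned in the PR description (96 and 248 are placeholders).

```rust
/// Hypothetical stand-in for a 256-bit word: four little-endian 64-bit limbs.
type Word = [u64; 4];

/// Returns the precomputed value of 1 << shift if it is in the table, None otherwise.
/// Being a const fn over a match, this needs neither a static nor lazy initialization.
const fn shl_precalc(shift: u8) -> Option<Word> {
    match shift {
        96 => Some([0, 1 << 32, 0, 0]),  // bit 96 = limb 1, bit 32
        160 => Some([0, 0, 1 << 32, 0]), // bit 160 = limb 2, bit 32
        248 => Some([0, 0, 0, 1 << 56]), // bit 248 = limb 3, bit 56
        _ => None,
    }
}
```

The caller keeps the same Some/None handling it would use with `HashMap::get`, so switching between the two is a local change.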

Contributor Author

I just checked with a match, and while the code is a bit simpler and more direct, in my benchmarks the current version is faster, so I dunno which we prefer in this situation.
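
One way to compare the two lookups in isolation is a small timing harness like the one below. It is only a sketch under assumptions: `Word`, `pow2`, `KEYS`, `build_map`, `lookup_match` and `time_lookups` are all hypothetical, and only the key 160 comes from the PR description. It measures the lookup alone, not the full ERC20 benchmarks quoted in this PR, so its numbers may not agree with the result reported here.

```rust
use std::collections::HashMap;
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a 256-bit word: four little-endian 64-bit limbs.
type Word = [u64; 4];

/// Returns 2^exp by setting a single bit; exp is a u8, so it is always < 256.
fn pow2(exp: u8) -> Word {
    let mut w = [0u64; 4];
    w[(exp / 64) as usize] = 1u64 << (exp % 64);
    w
}

/// Hypothetical key set; only 160 is named in the PR.
const KEYS: [u8; 3] = [96, 160, 248];

/// HashMap variant of the table.
fn build_map() -> HashMap<u8, Word> {
    KEYS.iter().map(|&k| (k, pow2(k))).collect()
}

/// Match variant of the table, covering the same keys.
fn lookup_match(shift: u8) -> Option<Word> {
    match shift {
        96 | 160 | 248 => Some(pow2(shift)),
        _ => None,
    }
}

/// Times `rounds` passes over every u8 shift amount for each variant.
/// black_box keeps the optimizer from discarding the lookups.
fn time_lookups(rounds: u32) -> (Duration, Duration) {
    let map = build_map();

    let start = Instant::now();
    for _ in 0..rounds {
        for shift in 0..=u8::MAX {
            black_box(map.get(&black_box(shift)).copied());
        }
    }
    let map_time = start.elapsed();

    let start = Instant::now();
    for _ in 0..rounds {
        for shift in 0..=u8::MAX {
            black_box(lookup_match(black_box(shift)));
        }
    }
    let match_time = start.elapsed();

    (map_time, match_time)
}
```

Calling `time_lookups` with a large round count in a release build gives a rough comparison; the mix of shift amounts seen in real contracts will differ from this uniform sweep.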


// Comparison and Bitwise Logic Operations (14)
// Opcodes: LT, GT, SLT, SGT, EQ, ISZERO, AND, OR, XOR, NOT, BYTE, SHL, SHR, SAR

static SHL_PRECALC: LazyLock<HashMap<u8, U256>> = LazyLock::new(|| {
Contributor

Let's add a comment here stating why we are doing this, along with a permalink to this PR for the rationale, in case we ever want to undo this change.

@fkrause98 fkrause98 added this pull request to the merge queue Feb 3, 2025
Merged via the queue into main with commit aa85d03 Feb 3, 2025
18 checks passed
@fkrause98 fkrause98 deleted the levm/optimize-bitshifts branch February 3, 2025 21:01