-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf(levm): optimize op_shl #1841
Conversation
with hardcoded bytecode for now
Using tx_report.output instead of current_call_frame.output
Now we run more preparation code for the VM; however without this change it wasn't executing correctly contracts where we call ourselves via STATICCALL or similar.
|
|
Benchmark Results ComparisonPR ResultsBenchmark Results: Factorial
Benchmark Results: Factorial - Recursive
Benchmark Results: Fibonacci
Benchmark Results: ManyHashes
Benchmark Results: BubbleSort
Benchmark Results: ERC20 - Transfer
Benchmark Results: ERC20 - Mint
Benchmark Results: ERC20 - Approval
Main ResultsBenchmark Results: Factorial
Benchmark Results: Factorial - Recursive
Benchmark Results: Fibonacci
Benchmark Results: ManyHashes
Benchmark Results: BubbleSort
Benchmark Results: ERC20 - Transfer
Benchmark Results: ERC20 - Mint
Benchmark Results: ERC20 - Approval
|
cadc789
to
1d5b416
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, seems to be a good and efficient solution (based on the benchmarks, we have a considerable improvement). We may have to test it more thoroughly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Fix the CI and we can merge it
|
…ex into levm/optimize-bitshifts
99b1389
to
dfadd07
Compare
|
||
// Comparison and Bitwise Logic Operations (14) | ||
// Opcodes: LT, GT, SLT, SGT, EQ, ISZERO, AND, OR, XOR, NOT, BYTE, SHL, SHR, SAR | ||
|
||
static SHL_PRECALC: LazyLock<HashMap<u8, U256>> = LazyLock::new(|| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this need to be static? Can't it be const?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
afaik for Hashmaps you need static (for instance https://users.rust-lang.org/t/is-there-a-way-to-create-a-constant-map-in-rust/8358) but if you have an idea I can try it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be too hard to test using a match instead of a hashmap? We can still return Some/None according to the input. Since we're benchmarking here, I'd like to see if we can squeeze a bit more of performance here. But this is just a nit. If it is too much trouble, don't bother.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just checked with a match and whilst the code is a bit simpler and more direct; benchmarking I get that this current version is faster, so I dunno what we prefer in this situation.
|
||
// Comparison and Bitwise Logic Operations (14) | ||
// Opcodes: LT, GT, SLT, SGT, EQ, ISZERO, AND, OR, XOR, NOT, BYTE, SHL, SHR, SAR | ||
|
||
static SHL_PRECALC: LazyLock<HashMap<u8, U256>> = LazyLock::new(|| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add a comment here stating why we are doing this, also add a permalink to this PR with the rationale of using it, if we ever want to undo this change.
Motivation
Reviewing the benchmarks for the ERC20 contracts, a major bottleneck was the constant bitshifting we were making, since
checked_shift_left
is quite slow. This introduces certain precomputed values and small optimizations to make that faster.Description
1<<amount
(units like wei, szabo, eth, etc. , addresses).1<<amount
) and do a safe pow() operation to avoid callingchecked_shift_left
.With this we go from a 12-10x difference with revm in the ERC20 benchmarks to only a ~2x more in par with other benchmarks. There's an obvious catch to this and it's that it's because we precomputed
1<<160
, but this is used for calculating addresses commonly in Ethereum contracts, so it should be useful imho.I'm obviously open to any discussions since this does add a bit of complexity and it's a tradeoff, but the code is pretty simple.