Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat[venom]: common subexpression elimination pass #4241

Open
wants to merge 72 commits into
base: master
Choose a base branch
from

Conversation

HodanPlodky
Copy link
Collaborator

What I did

Created common subexpression elimination pass

How I did it

Created available expression analysis and common subexpressions elimination pass. The available expression analysis is done with consideration to effects which heavy inspired by multidimensional fencing.

How to verify it

Created tests

Commit message

feat[venom]: common subexpression elimination

Cute Animal Picture

Put a link to a cute animal picture inside the parenthesis-->

class _Expression:
first_inst : IRInstruction
opcode: str
operands : list[IROperand]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like this can be IROperand | _Expression

HodanPlodky and others added 2 commits September 16, 2024 18:56
Co-authored-by: Charles Cooper <cooper.charles.m@gmail.com>
…ar fix

Co-authored-by: Charles Cooper <cooper.charles.m@gmail.com>
Copy link

codecov bot commented Sep 17, 2024

Codecov Report

Attention: Patch coverage is 91.83673% with 20 lines in your changes missing coverage. Please review.

Project coverage is 90.08%. Comparing base (b3ea663) to head (994ab14).

Files with missing lines Patch % Lines
vyper/venom/analysis/available_expression.py 91.11% 15 Missing and 1 partial ⚠️
...r/venom/passes/common_subexpression_elimination.py 93.10% 2 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4241      +/-   ##
==========================================
- Coverage   91.40%   90.08%   -1.32%     
==========================================
  Files         112      114       +2     
  Lines       15927    16172     +245     
  Branches     2694     2751      +57     
==========================================
+ Hits        14558    14569      +11     
- Misses        935     1127     +192     
- Partials      434      476      +42     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Comment on lines 83 to 90
def get_depth(self) -> int:
max_depth = 0
for op in self.operands:
if isinstance(op, _Expression):
d = op.get_depth()
if d > max_depth:
max_depth = d
return max_depth + 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this can probably be a cached property

_IDEMPOTENT_INSTRUCTIONS = ["log", "call", "staticcall", "delegatecall", "invoke"]


@dataclass
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

set frozen=True? this will let use do things like cache properties more safely.

Comment on lines 55 to 58
# General case
for self_op, other_op in zip(self.operands, other.operands):
if not self_op.same(other_op):
return False
Copy link
Member

@charles-cooper charles-cooper Oct 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this is a performance problem. i think the way to get around it is to map all expressions to ids. then we can construct equivalence sets in (roughly) O(n) instead of O(n^2) time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this approach will make the code cleaner too


@dataclass
class _Expression:
first_inst: IRInstruction
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe just call this inst?


class AvailableExpressionAnalysis(IRAnalysis):
expressions: OrderedSet[_Expression] = OrderedSet()
inst_to_expr: dict[IRInstruction, _Expression] = dict()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

memory leak



class AvailableExpressionAnalysis(IRAnalysis):
expressions: OrderedSet[_Expression] = OrderedSet()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

memory leak


if changed:
for out in bb.cfg_out:
if out not in worklist:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we require uniqueness in the worklist then we should use OrderedSet. it has a pop() function which is O(1) LIFO pop.

_MIN_DEPTH = 2

UNINTERESTING_OPCODES = ["store", "param", "offset", "phi", "nop"]
_IDEMPOTENT_INSTRUCTIONS = ["log", "call", "staticcall", "delegatecall", "invoke"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be NONIDEMPOTENT_INSTRUCTIONS. also i think we should do it the other way. we have to explicitly mark an instruction as idempotent, otherwise it defaults to not-idempotent.

_MIN_DEPTH = 2

UNINTERESTING_OPCODES = ["store", "param", "offset", "phi", "nop"]
_IDEMPOTENT_INSTRUCTIONS = ["log", "call", "staticcall", "delegatecall", "invoke"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be NONIDEMPOTENT_INSTRUCTIONS. also i think we should do it the other way. we have to explicitly mark an instruction as idempotent, otherwise it defaults to not-idempotent.

self.inst_to_available = dict()
self.bb_outs = dict()

self.ignore_msize = not self._contains_msize()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we should generalize this. basically what this is saying is, "we don't need to care about making memory reads volatile when there is no msize". but it's not that clear from how the code is written. (e.g. maybe there are other write effects which we don't need to preserve if there is no subsequent read).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants