Skip to content

Conversation

@WLUB
Copy link

@WLUB WLUB commented Feb 12, 2024

Analyzer/Checker draft

This is a draft for the analyzer/checker described in issue #6
Only some of the checks are implemented into this version.

Implementation

malVisitor in mal_visitor.py will use an analyzer malAnalyzerInterface. The visit method is overridden in malVisitor in order to call the checker-methods in the malAnalyzerInterface (if defined).

Analyzer implementation

Language Constituents

  • Define
  • Include
  • Category
  • Asset
  • Extends
  • Abstract asset
  • User info
  • Developer info
  • Modeler info
  • Let Partially implemented
  • Association Partially implemented

MAL Symbols

  • Leads to (->) Partially implemented
  • Append (+>) Partially implemented
  • OR-step (|) Partially implemented
  • AND-step (&) Partially implemented
  • Defense (#) Partially implemented
  • Existence (E) Partially implemented
  • Non-existence (!E) Partially implemented
  • Require (<-) Partially implemented
  • TypeOf ([]) Partially implemented
  • Union (/) Partially implemented
  • Intersection (/) Partially implemented
  • Difference (-) Partially implemented
  • Collect operator (X.A) Partially implemented
  • Transitive operator (X*.A) Partially implemented
  • Let substitution (X().A) Partially implemented
  • Hidden step
  • Debug step
  • Trace origin
  • Risk type Partially implemented

Note

  • Can an Asset with the same name be defined twice?
  • Can we use a same include more then once? (Risk of circular dependency)
  • Should the state of the analyzer be synchronous between all files when using include?
  • Should we allow two detectors on the same asset with the same name?
  • Should we allow an empty detector context?
  • Should we allow assets in the detector context that do not exist?

Copy link
Contributor

@nkakouros nkakouros left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's great that we now have an implementation; it will help with deciding which approach works best. You also seem to have a good grip on the language and what needs to be checked. Here are some comments:

The visit method is the operational, the visiting method. It is supposed to be used when visiting a context (instead of the implementing method, e.g.visitMal or visitStep). Checking the depth in visit to kickstart the analyzer works but it goes against the single responsibility principle. I think a better place would have been the visitMal method that already operates on depth 1.

The use of the malAnalyzer.analyze decorator is clever. But does it provide significant savings? Instead of having a super()... line, we now have a decorator line for each overridden method and a decorator class definition that makes calling the parent's implementation more obscure.

I also think that for the mal visitor to inherit from the analyzer seems a bit weird. On one hand conceptually it makes sense to me to be the other way around. On the other, some operations need to be replicated this way. For example, in visitDefine in the analyzer class you have:

define_id: str = ctx.ID().getText()

The same code exists in the visitDefine method of the mal visitor class. This should not affect performance but it makes the code repeatable.

Instead, you could have the analyzer inherit from the mal visitor and when overriding a method to add checks to it you could do:

result = super().visitDefine(...)

_, define = result  # result is a tuple with `"defines"` as the first item and a single-item dict as the second

if next(iter(define.values())) is None:  # single item in the `define` dict
  # fail here due to empty define value

I think this would look cleaner. We had discussed it as a potential approach in an email exchange, but seeing a first implementation now I think it makes more sense to do it this way.

Also, as you have seen, some checks about an entity, e.g. defines, may not be possible to place under its visit method. For instance, to check if there is a define that gives an id we need to check the whole mal spec. As mentioned above, such checks could run in the visitMal method of the analyzer class that extends the mal visitor, as this visitor method is at depth 1.

An alternative could be to just use the mal visitor as is, get a tree out of it and then just apply the checks on the tree (the dict that contains all the parsed nodes). But we will not be able to stop compiling as soon as an error can be detected (for instance, the moment when we see that a define does not have a value).

A slight problem with having the analyzer inherit from the visitor is that if you want to compile sth programmatically, you need to use the analyzer class which is counter-intuitive. To counter that, we could use dependency injection instead of inheritance, where we pass an Analyzer object to the visitor and the visitor uses it where it makes sense, as described above. I think this would be the cleanest approach. E.g. in malVisitor's visitDefine we would call self.analyzer.check_empty_define(the_value_to_be_returned_here). Or we could keep all checks in the visitMal method:

        for declaration in (d.getChild(0) for d in ctx.declaration()):
            if result := self.visit(declaration) or True:
                key, value = result

                if key == "categories":
                    ...

                if key == "defines":
                    self.analyzer.check_emtpy_define(value)
                    # or self.analyzer.check_define(value)  # where multiple checks on the single define can happen
                    # and a self.analyzer.check_defines() method would run later in visitMal once
                    # the whole tree is available to check sth on the whole set of defines.

                    langspec[key].update(value)

I see now you have highlighted some of these issues in your original comment as well.

Copy link
Contributor

@nkakouros nkakouros left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am thinking, instead of polluting all visitX methods with calls to the analyzer, why don't we just override visit and dynamically build the analyzer's method name that needs to be called based on the ctx visited?

Comment on lines 3 to 7
from .mal_analyzer import malAnalyzerInterface

from antlr4 import ParseTreeVisitor
from collections.abc import MutableMapping, MutableSequence

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put type checking dependencies behind a check:

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    import some_module

@nkakouros nkakouros closed this Jul 11, 2024
@nkakouros nkakouros reopened this Aug 7, 2024
@tagyieh tagyieh mentioned this pull request Feb 4, 2025
@nkakouros
Copy link
Contributor

Superseded by #111.

@nkakouros nkakouros closed this Feb 4, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants