Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft: Implementation for LibraryElementHasher #597

Open
wants to merge 1 commit into
base: develop
Choose a base branch
from

Conversation

FRoith
Copy link

@FRoith FRoith commented Oct 7, 2024

This change will add hashing functionality to the IDE, and automatically adds computed hash values to the output of the Forte NG exporter.

Copy link

Test Results

   100 files   - 10     100 suites   - 10   43s ⏱️ -5s
27 474 tests  - 57  27 474 ✅  - 57  0 💤 ±0  0 ❌ ±0 
27 475 runs   - 57  27 475 ✅  - 57  0 💤 ±0  0 ❌ ±0 

Results for commit 71e3705. ± Comparison against base commit 987566a.

This pull request removes 57 tests.
org.eclipse.fordiac.ide.test.export.forte_lua.st.ForteLuaStOperatorTest ‑ testAdd
org.eclipse.fordiac.ide.test.export.forte_lua.st.ForteLuaStOperatorTest ‑ testAnd
org.eclipse.fordiac.ide.test.export.forte_lua.st.ForteLuaStOperatorTest ‑ testCalc
org.eclipse.fordiac.ide.test.export.forte_lua.st.ForteLuaStOperatorTest ‑ testDivInt
org.eclipse.fordiac.ide.test.export.forte_lua.st.ForteLuaStOperatorTest ‑ testDivReal
org.eclipse.fordiac.ide.test.export.forte_lua.st.ForteLuaStOperatorTest ‑ testEq
org.eclipse.fordiac.ide.test.export.forte_lua.st.ForteLuaStOperatorTest ‑ testGt
org.eclipse.fordiac.ide.test.export.forte_lua.st.ForteLuaStOperatorTest ‑ testLe
org.eclipse.fordiac.ide.test.export.forte_lua.st.ForteLuaStOperatorTest ‑ testMod
org.eclipse.fordiac.ide.test.export.forte_lua.st.ForteLuaStOperatorTest ‑ testMul
…

Copy link
Contributor

@azoitl azoitl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The IHasherInterface, the extension point, and the helper for getting a hasher should be moved to the model plugin, best to the typelibrary. Because we will needed it for different use cases. I would even expect that the TypeEntry will cash the hash values as they are rather expensive to create.

Please also add the copyright headers to the new files. This is the reason the build didn't run through.

if (il.getVisibleOutputVars().size() > 0) {
parts.add("DATAOUTPUTS{"); //$NON-NLS-1$
for (final VarDeclaration vdec : il.getVisibleOutputVars()) { // TODO getVisibleOutputVars == DataOutputs?
parts.add(String.format("%s:%s", vdec.getName(), vdec.getTypeName())); //$NON-NLS-1$
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you not using your vardecl method here?

if (il.getVisibleInputVars().size() > 0) {
parts.add("DATAINPUTS{"); //$NON-NLS-1$
for (final VarDeclaration vdec : il.getVisibleInputVars()) { // TODO VisibleInputVars == DataInputs?
parts.add(String.format("%s:%s", vdec.getName(), vdec.getTypeName())); //$NON-NLS-1$
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you not using your vardecl method here?

if (il.getEventOutputs().size() > 0) {
parts.add("OUTPUTEVENTS{"); //$NON-NLS-1$
for (final Event event : il.getEventOutputs()) {
parts.add(String.format("%s:%s", event.getName(), event.getTypeName())); //$NON-NLS-1$
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you not using your event method here?

if (il.getEventInputs().size() > 0) {
parts.add("INPUTEVENTS{"); //$NON-NLS-1$
for (final Event event : il.getEventInputs()) {
parts.add(String.format("%s:%s", event.getName(), event.getTypeName())); //$NON-NLS-1$
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you not using your event method here?


private void process(final VarDeclaration vdec, final List<String> parts, final String version) {
// TODO handle LocalVariable or MemberVarDeclaration
parts.add(String.format("VDEC:%s:%s", vdec.getName(), //$NON-NLS-1$
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

data type name is missing

parts.add(String.format("ECC{")); //$NON-NLS-1$
parts.add(String.format("ECStates{")); //$NON-NLS-1$
for (final ECState ecs : ecc.getECState()) {
parts.add(ecs.getName());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actions are missing

@azoitl azoitl requested a review from ernstblechaPT October 13, 2024 14:46
@mx990 mx990 self-requested a review October 13, 2024 15:50
@mx990
Copy link
Member

mx990 commented Oct 13, 2024

Aside from what @azoitl has already mentioned, I have a few general remarks regarding your current approach:

  • You are hashing ST algorithms and methods by generating C++ code and using that for your hash. This means your hash:

    • is specific to the forte_ng export,
    • is sensitive to any future changes in the exported C++ code,
    • is sensitive to values of global constants in array ranges, which also makes it difficult to cache,
    • requires you to fully parse the ST code and generate C++ code, which makes it particularly expensive.

    I believe your hash is also currently sensitive to any added or removed lines in the ST code, since the generated C++ code contains #line directives (with the line number from the ST code). You would need to suppress them, for example, by overriding generateLineDirective(EObject) to return an empty string. Of course, that is easily done in this specific case, but something similar may come up again in the future and break your hash.

  • You are explicitly iterating over all relevant model elements for your hash. I believe this is quite fragile and error-prone, since this makes it easy to accidentially omit relevant information. This is particularly problematic when someone else may extend the model in the future, without realising the need to extend the hash alongside it.

I would thus like to propose a different approach:

  • use an XMI or binary EMF resource to generate the input for hashing,
  • use the plain ST code, stripped of any extra whitespace, pragmas, and comments, for the hash,
  • use meta-model annotations to specify features to be excluded from the hash.

I believe this has the following advantages over the current approach:

  • all features are included in the hash by default,
  • based on a known format instead of running our own,
  • exceptions can be specified declaratively on the meta-model,
  • follows any future changes to the model without significant additional work,
  • does not require parsing of the ST code.

@azoitl
Copy link
Contributor

azoitl commented Oct 14, 2024

@mx990 how is your xmi based approach not breaking any existing XMI export?

@mx990
Copy link
Member

mx990 commented Oct 15, 2024

@mx990 how is your xmi based approach not breaking any existing XMI export?

I believe the hashing implementation should be entirely independent of the XMI export, although it would certainly be quite similar in how it works. I would propose the following:

  • create custom annotations for the EMF meta-model [1] to:
    • exclude a feature from hashing (e.g., for comments, etc.),
    • specify a class for pre-processing elements (e.g., for ST algorithms to strip comments, etc.)
  • create a custom EcoreUtil.Copier that:
    • skips excluded features,
    • calls the pre-processor on copied elements if specified [2],
  • create a generic hasher that works similar to the XMI export [3]:
    • copy the type (with the custom Copier),
    • add the result into a plain org.eclipse.emf.ecore.xmi.impl.XMIResourceImpl,
    • save the resource into an appropriate OutputStream,
    • use the result for hashing.

[1] see org.eclipse.fordiac.ide.model.emf.FordiacMetaData
[2] see also org.eclipse.fordiac.ide.structuredtextcore.resource.LibraryElementXtextResource.ShallowCopier
[3] see org.eclipse.fordiac.ide.export.xmi.XMIExportFilter

@FRoith
Copy link
Author

FRoith commented Oct 16, 2024

I do really like the approach, I'll look into implementing it that way, and see if I hit any snags on the way. The only potential issue I see, is that I don't yet have an idea of how I would introduce detecting "old versions" of the hash with this approach, as that was one of the requested features, but I'll see what I can find.

@oberlehner
Copy link
Contributor

Please add unit tests.
This is really important, because we want to detect early if any implementation inside the IDE (e.g change in the parser) might potentially changes the hash values.

So please add test cases for every type that can hashed:
-SubappType
-SimpleType
-CFB
-SIFB

Are you hashing Structured Data Type ? i cannot find that in the code

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants