Implement a compile context/subcommand for standalone compilation mode #239

Open
elle-j wants to merge 29 commits into main from lj/compile-context

Contributor

elle-j commented Feb 20, 2026

Summary

Adds support for a new compile subcommand for compiling contracts in standalone/pre-link mode (without any test execution).

Additions overview:

  • New Compile context
  • New core compilations entry point and driver
  • New PreLinkCompilationSpecificReporter and PreLinkCompilationReport

Notes:

  • Compilation will always happen with -Oz in this PR.
    • Next PR will support opt modes via CLI args after fixing an issue with handling of resolc's mode.

Example run command:

retester compile \
    --compile /path/to/resolc-compiler-tests/fixtures/solidity/simple \
    --compile /path/to/resolc-compiler-tests/fixtures/solidity/complex \
    --resolc.path /path/to/resolc \
    --solc.version "0.8.33" \
    --report.file-name report-compile.json \
    --working-directory ./workdir \
    --concurrency.number-of-threads 10 \
    --concurrency.number-of-concurrent-tasks 100 \
    > logs-compile.log \
    2> output-compile.log

Example report:

{
  "context": {
    "Compile": {
      // ...
    }
  },
  "metadata_files": [
    // ...
  ],
  "execution_information": {
    "/path/to/solidity/simple/call_chain/address_size1.sol": {
      "compilation_reports": {
        "Y Mz S+": {
          "status": {
            "status": "Success",
            "compiled_contracts_info": {
              "/path/to/solidity/simple/call_chain/address_size1.sol": {
                "TestA": {
                  "bytecode_hash": "0x694498b0...",
                  "requires_linking": false
                },
                "TestB": {
                  "bytecode_hash": "0x1fc8c569...",
                  "requires_linking": false
                }
                // ...
              }
            }
            // ...
          }
        }
      }
    },
    "/path/to/solidity/simple/immutable_evm/trycatch.sol": {
      "compilation_reports": {
        "Y Mz S+": {
          "status": {
            "status": "Failure",
            "reason": "Encountered an error..."
            // ...
          }
        }
      }
    },
    "/path/to/solidity/complex/defi/starkex-verifier/test.json": {
      "compilation_reports": {
        "Y Mz S+": {
          "status": {
            "status": "Ignored",
            "reason": "Source pragma is incompatible with the Solidity compiler version.",
            "compiler_version": "0.8.33",
            "incompatible_files": [
              {
                "source_path": "/path/to/solidity/complex/defi/starkex-verifier/FriTransform.sol",
                "pragma": "^0.6.12"
              }
              // ...
            ]
          }
        }
      }
    }
    // ...
  }
}
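
To make the shape of the report easier to reason about, here is a minimal sketch of how the three "status" variants shown above could be modelled and tallied in memory. The type and function names (`CompilationStatus`, `tally`) are hypothetical and are not taken from this PR; the real report types carry more fields than shown here.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory mirror of the "status" object in the example report.
/// Only a subset of the fields from the JSON above is modelled.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum CompilationStatus {
    Success { contracts: Vec<String> },
    Failure { reason: String },
    Ignored { reason: String },
}

impl CompilationStatus {
    /// Returns the label used in the report's "status" field.
    pub fn label(&self) -> &'static str {
        match self {
            CompilationStatus::Success { .. } => "Success",
            CompilationStatus::Failure { .. } => "Failure",
            CompilationStatus::Ignored { .. } => "Ignored",
        }
    }
}

/// Counts how many (file, mode) entries ended in each status.
/// Outer key: file path; inner key: mode string such as "Y Mz S+".
pub fn tally(
    reports: &BTreeMap<String, BTreeMap<String, CompilationStatus>>,
) -> BTreeMap<&'static str, usize> {
    let mut counts = BTreeMap::new();
    for per_mode in reports.values() {
        for status in per_mode.values() {
            *counts.entry(status.label()).or_insert(0) += 1;
        }
    }
    counts
}
```

A consumer such as a CI job could use a tally like this to fail the run whenever the "Failure" count is non-zero while tolerating "Ignored" entries.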

This removes the need for a lot of duplicate code. If adding support for
a new specifier, instead of having to add 4 new macros that share almost
identical logic with the other macros, only 1 match arm needs to be
added to one existing macro.

This refactor also decouples the event's specifier field name from the
reporter's specifier field name, allowing them to be named differently
if needed. I think this also increases the readability somewhat of the
macros, showing more clearly what it matches on.

Contributor Author

elle-j, Feb 20, 2026

This file uses the same pattern as the differential tests entry point.

let compilation_definitions = create_compilation_definitions_stream(
    &full_context,
    &corpus,
    // TODO (temporarily always using `z`): Accept mode(s) via CLI.

Contributor Author

Will be addressed in the next PR.

Contributor Author

The macros here have been refactored in order to remove duplicate code. If adding support for a new specifier (as in this PR), only 1 match arm now needs to be added to one existing macro (avoiding the need to add 4 new macros that share almost identical logic with the other macros).

This refactor also decouples the event's specifier field name from the reporter's specifier field name, allowing them to be named differently if needed. I think this also increases the readability somewhat of the macros, showing a bit more clearly what it matches on.
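
As a rough illustration of the pattern described above (not the actual macros from this PR), the sketch below uses a single `macro_rules!` macro with one arm per specifier kind. All names (`Event`, `SpecificReporter`, `specific_event!`) are hypothetical. Note how the reporter's field is called `specifier` while each event names its field differently, which is the decoupling the comment refers to.

```rust
/// Hypothetical events, each carrying a differently named specifier field.
#[derive(Debug, PartialEq)]
pub enum Event {
    Test { test_specifier: String, message: String },
    PreLinkCompilation { compilation_specifier: String, message: String },
}

/// Hypothetical reporter; its field name is independent of the event's.
pub struct SpecificReporter {
    pub specifier: String,
}

/// One macro, one arm per specifier kind. Supporting a new kind means adding
/// one arm here rather than a new family of near-identical macros.
macro_rules! specific_event {
    (test, $reporter:expr, $message:expr) => {
        Event::Test {
            test_specifier: $reporter.specifier.clone(),
            message: $message.to_string(),
        }
    };
    (pre_link_compilation, $reporter:expr, $message:expr) => {
        Event::PreLinkCompilation {
            compilation_specifier: $reporter.specifier.clone(),
            message: $message.to_string(),
        }
    };
}

/// Builds a test event from the shared macro.
pub fn test_succeeded(reporter: &SpecificReporter) -> Event {
    specific_event!(test, reporter, "succeeded")
}

/// Builds a pre-link compilation event from the same macro.
pub fn compilation_succeeded(reporter: &SpecificReporter) -> Event {
    specific_event!(pre_link_compilation, reporter, "succeeded")
}
```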

use serde::{Deserialize, Serialize};

#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub enum ParsedCompilationSpecifier {

Contributor

Why do we need this file rather than the TestSpecifier? They seem to be functionally the same but the test specifier seems to be more permissive in what it allows us to point to.

Contributor Author

Grouped my response here to some related comments such as that ☝️ one.

Contributor

We discussed this and I agree with you that your approach here is better. Will mark this as resolved.

default_value = "",
value_hint = ValueHint::DirPath,
)]
pub working_directory: WorkingDirectoryConfiguration,

Contributor

I suggest that we remove the working directory from here. This is because we use it for 1) node files and 2) cached compiler artifacts. For this sub-command we don't use nodes and never want to make use of the cached compiler.

Contributor Author

elle-j, Feb 24, 2026

It's also used for the report location, so since the report currently is added to the working directory, I included that in this config. Do you suggest adding the report somewhere else in the case of the compile command?

Contributor

Ah, good point :D I suppose it's okay for the working directory to be provided and we could just invalidate the cached compiler always for the compilation runs.

},

/// An event sent by the reporter once an entire metadata file and mode combination has
/// finished standalone compilation.

Contributor

Can we refer in the codebase to standalone compilation as pre-link compilation? I think that it follows the same naming convention we've been using in the codebase.

Contributor Author

Yeah for sure 👍

Contributor Author

Done


/// This is a full description of a compilation to run alongside the full metadata file
/// and the specific mode to compile with.
pub struct CompilationDefinition<'a> {

Contributor

Hmm... I'm a bit worried about there being a lot of copied code in this file, and that the two files could then diverge. Is there anything we can do about that? Also, we've just had our first divergence with the pragma solidity check, which we've not done so far because we rely on the modes in the metadata files to specify which version of the compiler to use.

Contributor Author

elle-j, Feb 24, 2026

I'll group my response here to some comments that I think are related. That one ☝️ , this and this (ParsedCompilationSpecifier vs ParsedTestSpecifier), and this (CorpusCompilationConfiguration vs CorpusExecutionConfiguration):

Currently, the ParsedTestSpecifier is specific to test execution (it refers to cases, which are not needed in standalone/pre-link compilation). The way I see it, the usage differs from the current test/bench features in that the cases (and their modes) in the metadata files are intended for execution, so the preexisting ParsedTestSpecifier and CorpusExecutionConfiguration make sense there, inferring the opt modes from the metadata.

However, for the (standalone/prelink-only) compile command, it could be good to allow passing the modes as CLI args (saying "all compatible files should be compiled with these opt levels"), rather than having to rely on the levels indicated in the metadata files. So if we want to compile with other levels, we don't need to update metadata files, which then modifies what the test/bench features test (and vice versa). The metadata files logic is still used to e.g. find the related files to compile. (Accepting the opt levels via CLI args would be the next PR in that case btw, this PR uses only 1, acting as if the user only provided 1 level.)

I'd love to share more logic between the features, though, where suitable.

Having the compile feature determine modes in this way (not via the metadata files), leads to the divergence you're speaking of with the pragma solidity check. What are your thoughts on this difference in usage?
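
For readers unfamiliar with the pragma check being discussed, here is a deliberately simplified sketch of what such a compatibility test could look like. It only understands caret requirements such as `^0.6.12` and is not the check implemented in this PR; the function names are hypothetical, and a real implementation would more likely rely on a semver library and handle ranges such as `>=0.4.22 <0.9.0`.

```rust
/// Parses "major.minor.patch" into a tuple; returns None on malformed input.
fn parse_version(text: &str) -> Option<(u64, u64, u64)> {
    let mut parts = text.trim().split('.').map(|part| part.parse::<u64>().ok());
    let version = (parts.next()??, parts.next()??, parts.next()??);
    // Reject trailing components such as "0.8.33.1".
    if parts.next().is_some() {
        None
    } else {
        Some(version)
    }
}

/// Returns Some(true) if `compiler` satisfies a caret requirement such as
/// "^0.6.12", Some(false) if it does not, and None if the pragma is not in
/// the (very limited) form this sketch supports.
pub fn caret_pragma_allows(pragma: &str, compiler: &str) -> Option<bool> {
    let required = parse_version(pragma.trim().strip_prefix('^')?)?;
    let actual = parse_version(compiler)?;
    // Caret semantics: the left-most non-zero component must not change.
    let upper = match required {
        (0, 0, patch) => (0, 0, patch + 1),
        (0, minor, _) => (0, minor + 1, 0),
        (major, _, _) => (major + 1, 0, 0),
    };
    // Tuples compare lexicographically, which matches version ordering.
    Some(actual >= required && actual < upper)
}
```

With the values from the example report, `caret_pragma_allows("^0.6.12", "0.8.33")` evaluates to `Some(false)`, which corresponds to the "Ignored" status for the starkex-verifier sources.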

Contributor Author

Regarding what we briefly discussed offline about potentially using case idx 0 for standalone compilation mode, I'd opt for not going with that approach, essentially because it conflates unrelated concepts and introduces irrelevant data that we need to remember to interpret differently from what would be semantically assumed (e.g. a “case”, whether using a hardcoded index or not, is ambiguous in a standalone compilation context). I think a clearer separation between the two would make it easier to make test-specific or compile-only changes.

Claude and I looked more into possible ways to share more of the code where suitable; one candidate is the orchestration logic (concurrent for test and compile, sequential for bench).

I'd say that such a refactor could be done in a follow-up (even though it may be small) once we get this feature out and the resolc repo can start using it. Thoughts?

Contributor

0xOmarA, Feb 26, 2026

After our discussion, I get why this struct was added and the overall abstraction that you're after. I also understand not wanting to do the "case index 0" approach. To be honest, it was somewhat hacky, but it was a way to reuse a lot of what we had.

I'd recommend that we reuse more code in this PR, since it's already quite a small change to make and there's no need to delay it until a later PR.

I imagine that code-reuse for this could look something like the following:

pub struct SomeDefinition<'a, AdditionalState> {
    // There are fields which are shared by the test definitions and the compilation definitions.
    shared_field1: SomeType,
    shared_field2: SomeOtherType,
    // These are fields which are different between the test case definition and the compilation definition.
    additional_state: AdditionalState
}

Then, we can specialize the implementation of check_compatibility based on which type of state is available:

impl<'a> SomeDefinition<'a, TestAdditionalState> {
    pub fn check_compatibility(&self) -> Result<()> {
        self.check_compatibility_of_shared_fields()?;
        todo!()
    }
}

impl<'a> SomeDefinition<'a, CompilationAdditionalState> {
    pub fn check_compatibility(&self) -> Result<()> {
        self.check_compatibility_of_shared_fields()?;
        todo!()
    }
}

impl<'a, AdditionalState> SomeDefinition<'a, AdditionalState> {
    pub fn check_compatibility_of_shared_fields(&self) -> Result<()> {
        todo!()
    }
}

Additionally, we could implement Deref and DerefMut for this, which would mean that, to the parts of the code using these structs, their structure would look the same: we'd still be able to access fields on additional_state with the syntactic sugar of it looking like we're not going through it. This is nice because it minimizes the size of this refactor.
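
To make the proposal concrete, here is a compilable sketch of the generic-struct idea with `Deref` added. Every field and type in it is a stand-in chosen for illustration (the shared fields are plain `&str` values, and the two state structs each carry one made-up field); none of it is taken from the codebase.

```rust
use std::ops::Deref;

/// Hypothetical state that only a test definition needs.
pub struct TestAdditionalState {
    pub case_idx: usize,
}

/// Hypothetical state that only a compilation definition needs.
pub struct CompilationAdditionalState {
    pub solc_version: String,
}

/// Shared fields plus a slot for context-specific state.
pub struct SomeDefinition<'a, AdditionalState> {
    pub metadata_file_path: &'a str,
    pub mode: &'a str,
    pub additional_state: AdditionalState,
}

impl<'a, AdditionalState> SomeDefinition<'a, AdditionalState> {
    /// Checks that apply regardless of the kind of definition.
    pub fn check_compatibility_of_shared_fields(&self) -> Result<(), String> {
        if self.mode.is_empty() {
            Err(format!("{}: no mode given", self.metadata_file_path))
        } else {
            Ok(())
        }
    }
}

impl<'a> SomeDefinition<'a, TestAdditionalState> {
    pub fn check_compatibility(&self) -> Result<(), String> {
        self.check_compatibility_of_shared_fields()?;
        // Test-specific checks would go here.
        Ok(())
    }
}

impl<'a> SomeDefinition<'a, CompilationAdditionalState> {
    pub fn check_compatibility(&self) -> Result<(), String> {
        self.check_compatibility_of_shared_fields()?;
        // `solc_version` is reached through Deref, without naming `additional_state`.
        if self.solc_version.is_empty() {
            return Err("no compiler version given".to_string());
        }
        Ok(())
    }
}

/// Lets callers read fields of the additional state directly on the definition.
impl<'a, AdditionalState> Deref for SomeDefinition<'a, AdditionalState> {
    type Target = AdditionalState;

    fn deref(&self) -> &AdditionalState {
        &self.additional_state
    }
}
```

With this in place, `definition.case_idx` and `definition.additional_state.case_idx` refer to the same field. Whether `Deref` is appropriate for a type that is not a smart pointer is a judgment call; the explicit field access always remains available.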

Contributor Author

elle-j, Feb 26, 2026

Interesting, okay good idea 🙂

Contributor Author

I've updated the code to share the orchestration logic which is beneficial. I've further evaluated the sharing of a potential SomeSharedDefinition but I'd say that the added code is likely not necessary, because:

  • The only shared fields as of now would be the ones below. No method logic is actually shared.
    pub metadata: &'a MetadataFile,
    pub metadata_file_path: &'a Path,
    pub mode: Cow<'a, Mode>,
    pub reporter: Reporter, // via type param
  • If we were to use a shared struct with e.g. a field for additional context-specific state:
    • Using Deref/DerefMut to avoid the extra field access seems to be an anti-pattern when used for non-smart-pointer types (ref1, ref2, ref3).
    • Not using Deref/DerefMut will add the extra field access.

I think having the separate definitions is more favorable at this time, but implementing the shared struct would be quick, so let me know if you still would prefer that approach.
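
One way to picture "separate definitions, shared orchestration" is a small trait that the driver depends on, with each definition type implementing it independently. The sketch below is an assumption about the general shape only: the trait, the enum, and both structs are hypothetical, and scoped threads stand in for whatever concurrency mechanism the project actually uses.

```rust
use std::thread;

/// Hypothetical trait: the only thing the shared driver needs from a definition.
pub trait Runnable: Sync {
    fn label(&self) -> String;
    fn run(&self) -> Result<(), String>;
}

/// How the driver schedules the definitions.
pub enum Orchestration {
    Concurrent,
    Sequential,
}

/// Shared orchestration logic, independent of the definition type.
/// Results are returned in the same order as the input slice.
pub fn drive<D: Runnable>(
    definitions: &[D],
    orchestration: Orchestration,
) -> Vec<(String, Result<(), String>)> {
    match orchestration {
        Orchestration::Sequential => definitions
            .iter()
            .map(|definition| (definition.label(), definition.run()))
            .collect(),
        Orchestration::Concurrent => thread::scope(|scope| {
            // Spawn every worker first, then join them in input order.
            let handles: Vec<_> = definitions
                .iter()
                .map(|definition| scope.spawn(move || (definition.label(), definition.run())))
                .collect();
            handles
                .into_iter()
                .map(|handle| handle.join().expect("worker panicked"))
                .collect()
        }),
    }
}

/// Stand-in for a test definition.
pub struct TestDefinition {
    pub case_idx: usize,
}

/// Stand-in for a compilation definition.
pub struct CompilationDefinition {
    pub path: String,
}

impl Runnable for TestDefinition {
    fn label(&self) -> String {
        format!("case {}", self.case_idx)
    }
    fn run(&self) -> Result<(), String> {
        Ok(())
    }
}

impl Runnable for CompilationDefinition {
    fn label(&self) -> String {
        self.path.clone()
    }
    fn run(&self) -> Result<(), String> {
        if self.path.is_empty() {
            Err("empty path".to_string())
        } else {
            Ok(())
        }
    }
}
```

The two definition structs share no fields here, yet both go through the same `drive` function, which is the trade-off argued for in the comment above.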

Comment on lines -144 to -153
if deployed_libraries.is_some() {
    reporter
        .report_post_link_contracts_compilation_succeeded_event(
            compiler.version().clone(),
            compiler.path(),
            true,
            None,
            cache_value.compiler_output.clone(),
        )
        .expect("Can't happen");

Contributor Author

Dead code: we're already in the None arm of the match on deployed_libraries.

Contributor

0xOmarA commented Feb 26, 2026

It's looking a lot better, can we get more code re-use before we merge?

elle-j changed the title from "[WIP] Implement a compile context/subcommand for standalone compilation mode" to "Implement a compile context/subcommand for standalone compilation mode" on Mar 2, 2026
elle-j marked this pull request as ready for review on March 2, 2026, 18:26
elle-j requested review from 0xOmarA and xermicus on March 2, 2026, 18:26