Add replay-witness-test job to L2 CI workflow by avilagaston9 · Pull Request #6364 · lambdaclass/ethrex

avilagaston9 · 2026-03-13T18:22:17Z

Motivation

We need CI coverage to verify that ethrex-replay works against the current ethrex code. This catches breaking changes to ExecutionWitness, the debug_executionWitness RPC endpoint, or any VM/blockchain APIs that ethrex-replay depends on. Follows up on #6356.

Description

Add a replay-witness-test job to the L2 CI workflow that:

Starts an L1 dev node using the existing docker image
Clones ethrex-replay and patches its ethrex dependencies to the current PR's local checkout using Cargo [patch]
Sends a test transaction via rex and replays the block containing it with --no-zkvm
Asserts the replay succeeds (state root matches)

The [patch] approach means ethrex-replay's Cargo.toml always points to ethrex main, but CI overrides the dependencies with the local checkout. No pinned SHAs or branch name assumptions.

Checklist

Updated STORE_SCHEMA_VERSION — N/A, CI-only change

patches its ethrex dependencies with the current PR's local checkout using Cargo [patch], then sends a transaction to a dev L1 node and replays the block to verify the execution witness end-to-end.

github-actions · 2026-03-13T18:23:12Z

🤖 Kimi Code Review

Review of PR #6364: Replay Witness Test CI Job

Security & Correctness Issues

Hard-coded private key in CI workflow (lines 1016-1017)
- The private key 0x850643a0224065ecce3882673c21f56bcf6eef86274cc21cadff15930b59fc8c is hard-coded in the workflow file
- This is a security risk as it's exposed in the repository history
- Recommendation: Use GitHub secrets or environment variables for sensitive data
Unvalidated transaction hash extraction (lines 1018-1021)
- The TX_HASH extraction uses head -1 without validation
- If the rex send command fails or outputs unexpected format, this could capture incorrect data
- Recommendation: Add validation to ensure the output is a valid transaction hash format
Race condition in transaction receipt polling (lines 1026-1032)
- Uses fixed 3-second sleep instead of proper polling with timeout
- Could fail if block production is slower than expected
- Recommendation: Implement proper polling with exponential backoff and timeout

Performance & Reliability Issues

Missing timeout for replay command (lines 1037-1041)
- The ethrex-replay command runs without timeout protection
- Could hang indefinitely if there are issues
- Recommendation: Add timeout wrapper or use GitHub Actions timeout-minutes more granularly
Inefficient dependency patching (lines 965-1000)
- The Cargo.toml patching uses a large block of hard-coded paths
- This is brittle and will break if crate structure changes
- Recommendation: Consider using a script to generate the patch dynamically or use workspace inheritance

Code Quality & Maintainability

Python dependency assumption (line 1031)
- Assumes Python3 is available for JSON parsing
- Recommendation: Use jq instead (likely already available in CI) or add explicit Python installation step
Missing cleanup steps
- No cleanup of /tmp/ethrex-replay directory after test completion
- Recommendation: Add cleanup step to prevent disk space issues
Inconsistent job naming
- Job name is "Replay Witness Test" but the actual test seems to be a basic transaction replay
- Recommendation: Rename to better reflect the test's purpose

Minor Issues

Redundant condition in final job (line 1055)
- The if condition is getting very long and hard to read
- Recommendation: Consider using a more concise approach or splitting into multiple lines
Missing error handling for docker commands
- No validation that docker load and docker compose up commands succeed
- Recommendation: Add || exit 1 or similar error handling

The workflow addition is generally well-structured, but addressing the security issue with the private key and the reliability issues with transaction polling should be prioritized.

Automated review by Kimi (Moonshot AI) · custom prompt

github-actions · 2026-03-13T18:24:09Z

🤖 Codex Code Review

High – Unverified remote binary execution in CI
File: .github/workflows/pr-main_l2.yaml
curl -L .../rex-linux-x86_64 downloads and executes a binary without checksum/signature verification. This is a supply-chain risk (runner code execution if release/CDN/repo is compromised).
Suggested change: pin artifact digest and verify before install (e.g., sha256sum -c), or build from source at a pinned commit.
High – Non-deterministic/unpinned external repo in required check
File: .github/workflows/pr-main_l2.yaml
git clone https://github.com/lambdaclass/ethrex-replay.git pulls default branch head. Required CI behavior can change independently of this PR, and compromise of that repo can alter gate behavior.
Suggested change: checkout a pinned commit/tag (git checkout <sha>), and update intentionally.
Medium – Flaky transaction/receipt handling can produce false CI failures
File: .github/workflows/pr-main_l2.yaml
TX_HASH=$(... | head -1) assumes first output line is hash; this is brittle if CLI output format changes.
File: .github/workflows/pr-main_l2.yaml
Single sleep 3 + one receipt fetch is race-prone; result can be null, causing Python failure at line 1029.
Suggested change: parse structured output if available (--json), validate hash format, and poll receipt with timeout/retries until non-null block number.

No Rust/EVM/state-transition code changed in this PR, so consensus/opcode/gas/state-trie/RLP correctness is not directly impacted here.

Automated review by OpenAI Codex · custom prompt

github-actions · 2026-03-13T18:24:19Z

🤖 Claude Code Review

Here is my review of PR 6364:

PR Review: Add `replay-witness-test` job to L2 CI workflow

The approach is sound: using Cargo [patch] to override ethrex dependencies in ethrex-replay with the local PR checkout is clean and avoids branch-name assumptions. The overall structure follows the existing CI patterns.

Bugs / Reliability Issues

Fragile sleep 3 for transaction receipt (line 1025)

The fixed 3-second sleep before requesting the receipt is a race condition. If the dev node is under any load or the block time is configured longer, eth_getTransactionReceipt returns {"result": null}, and the downstream python3 invocation will crash with an unhelpful TypeError: 'NoneType' object is not subscriptable.

The "Wait for L1 to be ready" step already has a well-structured polling loop — the same pattern should be used here:

MAX_TRIES=15
for i in $(seq 1 $MAX_TRIES); do
  BLOCK_NUMBER=$(curl -sf -X POST http://localhost:8545 \
    -H "Content-Type: application/json" \
    -d "{\"jsonrpc\":\"2.0\",\"method\":\"eth_getTransactionReceipt\",\"params\":[\"$TX_HASH\"],\"id\":1}" \
    | python3 -c "import sys,json; r=json.load(sys.stdin); d=r.get('result'); print(int(d['blockNumber'],16)) if d else exit(1)" 2>/dev/null) && break
  echo "Waiting for receipt... (attempt $i)"
  sleep 2
done

TX_HASH extracted without validation (line 1021)

rex send ... 2>&1 | head -1 returns the first line of combined stdout+stderr. If rex prints a warning or progress line before the hash, TX_HASH will contain non-hash content, and the subsequent eth_getTransactionReceipt call will silently return null (or an error), again crashing in python3 with no useful message. At minimum, validate the extracted value looks like a hex hash:

if ! [[ "$TX_HASH" =~ ^0x[0-9a-fA-F]{64}$ ]]; then
  echo "Failed to extract valid tx hash: '$TX_HASH'"
  exit 1
fi

Security

rex binary downloaded without checksum verification (line 977)

curl -L https://github.com/lambdaclass/rex/releases/download/v8.0.0/rex-linux-x86_64 -o /tmp/rex

The binary is downloaded and executed with no integrity check. If the GitHub release is tampered with or the download is MITM'd (unlikely with GitHub but still a supply-chain concern for a CI pipeline that runs on PRs), the binary runs in the pipeline with full access to $GITHUB_WORKSPACE and the Docker socket. A SHA-256 checksum check should follow the download:

echo "<expected-sha256>  /tmp/rex" | sha256sum --check

Note: The hardcoded private key on line 1017 is fine — it is a well-known dev genesis account used in a local-only ephemeral node, equivalent to Hardhat/Foundry's default accounts.

Performance

cargo build --release for a CI smoke test (line 1012)

A release build for ethrex-replay is significantly slower than a debug build. Since this job is purely a correctness/integration test (not a performance benchmark), cargo build (debug mode) would be sufficient and meaningfully faster. The binary path would change from ./target/release/ethrex-replay to ./target/debug/ethrex-replay.

Minor / Design Notes

ethrex-replay cloned from latest main (line 983)

Acknowledged in the PR description, but worth noting: if ethrex-replay's main branch has a breaking API change or build failure unrelated to this PR, the CI job will fail on this PR with a misleading error. Pinning to a tag (e.g., git clone --branch v1.2.3) or a specific commit SHA would make failures deterministic and attributable.

all-tests gate condition is asymmetric (line 1058)

The existing if: condition checks that several jobs are not skipped, and this PR correctly adds needs.replay-witness-test.result != 'skipped'. However, integration-test-tdx is listed in needs but is absent from the if: condition check (this is a pre-existing issue, not introduced here). The new replay-witness-test is handled correctly.

[patch] section indentation (lines 991–1009)

The heredoc content has leading whitespace (4 spaces before [patch....] and each line). Cargo accepts this for TOML, but it's slightly non-standard. More importantly, Cargo requires [patch] sections to be at the top level — leading spaces on the [patch....] header line may be rejected by stricter TOML parsers in future Cargo versions. It would be safer to strip the leading indentation from the heredoc body.

Summary

Severity	Item
Medium	`sleep 3` is a race — poll for receipt with retry logic
Medium	`rex` binary not checksum-verified
Low	`TX_HASH` not validated before use
Low	`cargo build --release` is slow; debug build suffices for CI
Info	ethrex-replay pinned to latest `main`
Info	TOML heredoc indentation

The core design (Cargo [patch], docker image reuse, dev node setup) is solid. Addressing the receipt polling race and the binary checksum are the most important fixes before merging.

Automated review by Claude (Anthropic) · custom prompt

Add replay-witness-test job to L2 CI workflow that clones ethrex-replay,

d8295ce

patches its ethrex dependencies with the current PR's local checkout using Cargo [patch], then sends a transaction to a dev L1 node and replays the block to verify the execution witness end-to-end.

github-actions bot assigned avilagaston9 Mar 13, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add replay-witness-test job to L2 CI workflow#6364

Add replay-witness-test job to L2 CI workflow#6364
avilagaston9 wants to merge 1 commit intorefactor/remove-keys-from-execution-witnessfrom
ci/replay-witness-test

avilagaston9 commented Mar 13, 2026

Uh oh!

github-actions bot commented Mar 13, 2026

Uh oh!

github-actions bot commented Mar 13, 2026

Uh oh!

github-actions bot commented Mar 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

avilagaston9 commented Mar 13, 2026

Uh oh!

github-actions bot commented Mar 13, 2026

🤖 Kimi Code Review

Review of PR #6364: Replay Witness Test CI Job

Security & Correctness Issues

Performance & Reliability Issues

Code Quality & Maintainability

Minor Issues

Uh oh!

github-actions bot commented Mar 13, 2026

🤖 Codex Code Review

Uh oh!

github-actions bot commented Mar 13, 2026

🤖 Claude Code Review

PR Review: Add replay-witness-test job to L2 CI workflow

Bugs / Reliability Issues

Security

Performance

Minor / Design Notes

Summary

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

PR Review: Add `replay-witness-test` job to L2 CI workflow