ci: Ensure the right cargo subcommands are picked #71
CI started to fail with errors while running `cargo clippy` with `--ignore-environment`. Using this flag clears most of the environment variables, such as the system's `$PATH`, helping create more isolated builds. We want to avoid using system binaries, which might differ. For this reason, this failure was puzzling to me, as most of the environment is under our precise control: `cargo`, `rustc`, and others are defined in our `nix` flake and we commit the lockfile. No upgrades happened between the last good and the first bad CI runs.

There are two potential sources of non-isolation left: (1) the base image that GitHub Actions runners use and (2) the `nix` version we run, as we don't currently pin them (this will be fixed later on). Unfortunately, even pinning them to a version released before the breakage was introduced didn't fix the problem at hand.

We were not convinced that (1) was the issue, as we liberally use `--ignore-environment`, which should prevent the environment from getting tainted, but after some tests we saw this:

With a clean environment:
Note the wrong `cargo-clippy` binary (`/home/javierhonduco/.cargo/bin/cargo-clippy`), provided by the system rather than by `nix`.

Without a clean environment:
which shows the expected path for `cargo-clippy`.

After some more debugging it was clear that the main difference was in `$PATH`: in the non-isolated environment, the `nix`-provided binaries come first, followed by an entry for `/home/javierhonduco/.cargo/bin`, which contains the system-provided binaries. Those are not used for cargo subcommands because the order of the search path is respected.

This didn't seem to make a lot of sense until we found rust-lang/cargo#11023, which has this comment:
So, in summary, since we were cleaning `$PATH` and `$CARGO_HOME/bin` was no longer present in it, `cargo` will fall back to executing binaries in `~/.cargo/bin`, even if `$PATH` contains all the right cargo subcommand binaries.

This debugging was complicated by the fact that we could not easily reproduce this issue locally, and that we expected no leakage from the environment. We need to be mindful that `nix` doesn't have a 'real' filesystem sandbox like buck2 or bazel do.

Hence the initial issue was caused by a different `cargo-clippy` binary, which somehow triggered the failure. This is not fully understood at this time:

I don't need to dig deeper right now, but perhaps an upgrade of our Rust toolchain will surface issues. I will keep an eye out for this in case there are any bugs lying around.
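The lookup behavior summarized above can be illustrated with a small model. This is only a sketch in Python, not cargo's code: the function names are made up for illustration, and the rule that `$CARGO_HOME/bin` is searched first when it is missing from `$PATH` is our reading of rust-lang/cargo#11023. It also only checks that a file exists, not that it is executable.

```python
import os
import tempfile
from pathlib import Path
from typing import Dict, List, Optional


def search_directories(path_env: str, cargo_home: str) -> List[Path]:
    """Simplified model of where cargo looks for `cargo-<name>` binaries."""
    dirs = [Path(p) for p in path_env.split(os.pathsep) if p]
    home_bin = Path(cargo_home) / "bin"
    # Assumption (our reading of rust-lang/cargo#11023): when $CARGO_HOME/bin
    # is not already on $PATH, it is searched before every $PATH entry.
    if home_bin not in dirs:
        dirs.insert(0, home_bin)
    return dirs


def resolve_subcommand(name: str, path_env: str, cargo_home: str) -> Optional[Path]:
    """Return the first `cargo-<name>` file found in search order."""
    for directory in search_directories(path_env, cargo_home):
        candidate = directory / f"cargo-{name}"
        if candidate.is_file():
            return candidate
    return None


def make_fake_binary(directory: Path, name: str) -> Path:
    """Create an empty file standing in for a real binary."""
    directory.mkdir(parents=True, exist_ok=True)
    binary = directory / name
    binary.write_text("")
    return binary


def demo() -> Dict[str, bool]:
    """Compare resolution with a cleaned $PATH and with an inherited one."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        nix_bin = root / "nix-toolchain" / "bin"  # stand-in for the nix-provided toolchain
        cargo_home = root / "home" / ".cargo"     # stand-in for ~/.cargo
        make_fake_binary(nix_bin, "cargo-clippy")
        make_fake_binary(cargo_home / "bin", "cargo-clippy")

        # Cleaned environment: only the nix toolchain is on $PATH.
        clean_path = str(nix_bin)
        # Inherited environment: nix first, then ~/.cargo/bin.
        inherited_path = os.pathsep.join([str(nix_bin), str(cargo_home / "bin")])

        clean = resolve_subcommand("clippy", clean_path, str(cargo_home))
        inherited = resolve_subcommand("clippy", inherited_path, str(cargo_home))
        return {
            "clean_uses_nix": clean == nix_bin / "cargo-clippy",
            "inherited_uses_nix": inherited == nix_bin / "cargo-clippy",
        }
```

Under this model, the cleaned `$PATH` resolves to the copy in `~/.cargo/bin`, while the inherited `$PATH` resolves to the `nix` copy, which matches what we observed in CI.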
I verified that there was a base image upgrade around the time the issue started to happen, in actions/runner-images@a68ad81.
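Since a base image upgrade can change what lives in `~/.cargo/bin`, one way to notice this kind of leakage earlier would be a check that lists the `cargo-*` files there that also exist in the toolchain directories we trust. The sketch below is hypothetical and not part of this change; `shadowing_candidates` and the directory layout in `demo` are invented for illustration.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def shadowing_candidates(cargo_home_bin: Path, trusted_dirs: Iterable[Path]) -> List[str]:
    """List `cargo-*` files in cargo_home_bin that also exist in a trusted dir."""
    if not cargo_home_bin.is_dir():
        return []
    trusted = list(trusted_dirs)
    return sorted(
        entry.name
        for entry in cargo_home_bin.glob("cargo-*")
        if any((directory / entry.name).is_file() for directory in trusted)
    )


def demo() -> List[str]:
    """Build a throwaway layout and report which subcommands could be shadowed."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        home_bin = root / "home" / ".cargo" / "bin"  # stand-in for ~/.cargo/bin
        nix_bin = root / "nix-toolchain" / "bin"     # stand-in for the nix toolchain
        home_bin.mkdir(parents=True)
        nix_bin.mkdir(parents=True)
        for name in ("cargo-clippy", "cargo-fmt", "rustup"):
            (home_bin / name).write_text("")
        for name in ("cargo", "cargo-clippy"):
            (nix_bin / name).write_text("")
        return shadowing_candidates(home_bin, [nix_bin])
```

In this layout only `cargo-clippy` is reported, because it is the one subcommand present in both places. A non-empty result would be a hint that the runner image ships binaries that could be picked instead of ours.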
Test Plan
CI