Skip to content

Commit

Permalink
Fix improper capturing of mask into package
Browse files Browse the repository at this point in the history
We have observed packages getting built containing a file removal mask,
despite `test_package_with_circular_dep_does_not_collect_file_removals`
attempting to test correct behavior in this area. It turns out that a
specific setup is required to exhibit the incorrect behavior.

When the result of a build is being diffed to figure out what changes are
to be captured into the new package, the "environment_filesystem" Manifest
was being incorrectly calculated. If multiple layers in the Solution
contain entries for the same subdirectory, only the latest layer that
does so would be used to represent what the "before" contents were for that
directory.

Consider one package that puts a file into "subdir/" like "subdir/foo.txt",
and another package that simply creates an empty directory called
"subdir/". The order that layers end up appearing in a Solution is arbitrary,
so with this setup it is possible for the Manifest to end up with an empty
"subdir/" directory, whereas the true build environment would have
contained the "subdir/foo.txt" file entry.

In our environment we have a package called "stdfs" which contains an empty
"lib/" subdirectory. When this package's entries are walked, its empty
"lib/" directory would overwrite any contents that had already been
encounted under "lib/". In our problem case, that included a _deletion_ of
a subdirectory under "lib/".

Since the "before" contents ended up not containing the entry that was
deleted by the build, when diffing the before and after, the after would
include a diff event that "Added" a new mask entry, and its presence in
the diff would allow it to be collected into the newly built package.

One "fix" for this was to modify `diff_path` so it detects this
(unexpected) situation and ignore mask entries when the mask is deleting
something that didn't exist in the "before" Manifest. But this problem is
a symptom of the "before" Manifest being incorrect.

The proper fix was to change the "environment_filesystem" calculation to
merge two Trees when Trees with the same path appear in multiple layers.

In order to properly test the expected behavior, the
"environment_filesystem" calculation was refactored out into its own
method. It was moved to `ResolvedLayers` since practically none if its
logic was related to `BinaryPackageBuilder`.

The recursive package test was updated in such a way that it would actually
fail without these changes, however the problem ended up not being specific
to recursive packages. A new test was added to specifically test the Tree
merge behavior.

Fixes #1013.

Signed-off-by: J Robert Ray <jrray@jrray.org>
  • Loading branch information
jrray committed Jun 8, 2024
1 parent 25e15ad commit c1f0188
Show file tree
Hide file tree
Showing 9 changed files with 248 additions and 99 deletions.
7 changes: 7 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

11 changes: 11 additions & 0 deletions crates/spfs/src/tracking/diff.rs
Original file line number Diff line number Diff line change
Expand Up @@ -124,6 +124,8 @@ pub fn compute_diff<U1: Clone>(a: &Manifest<U1>, b: &Manifest<U1>) -> Vec<Diff<U
match diff_path(a, b, &entry.path) {
Some(d) => changes.push(d),
None => tracing::debug!(
// XXX this is not the only reason `diff_path` may return
// None.
"path was missing from both manifests during diff, this should be impossible"
),
}
Expand All @@ -149,6 +151,15 @@ fn diff_path<U1: Clone, U2: Clone>(
path: path.clone(),
}),

(None, Some(b_entry)) if b_entry.kind == EntryKind::Mask => {
debug_assert!(
false,
"detected a mask entry that deletes something that doesn't exist"
);
// Can't return `a` as documented since the left side is None.
None
}

(None, Some(e)) => Some(Diff {
mode: DiffMode::Added(e.clone()),
path: path.clone(),
Expand Down
80 changes: 7 additions & 73 deletions crates/spk-build/src/build/binary.rs
Original file line number Diff line number Diff line change
Expand Up @@ -11,11 +11,12 @@ use std::sync::Arc;
use futures::StreamExt;
use relative_path::RelativePathBuf;
use spfs::prelude::*;
use spfs::tracking::{DiffMode, EntryKind};
use spfs::tracking::DiffMode;
use spk_exec::{
pull_resolved_runtime_layers,
resolve_runtime_layers,
solution_to_resolved_runtime_layers,
ConflictingPackagePair,
};
use spk_schema::foundation::env::data_path;
use spk_schema::foundation::format::FormatIdent;
Expand All @@ -39,7 +40,6 @@ use spk_solve::graph::Graph;
use spk_solve::solution::Solution;
use spk_solve::{BoxedResolverCallback, Named, ResolverCallback, Solver};
use spk_storage as storage;
use tokio::pin;

use crate::report::{BuildOutputReport, BuildReport, BuildSetupReport};
use crate::validation::{Report, Validator};
Expand Down Expand Up @@ -79,11 +79,6 @@ pub enum BuildSource {
LocalPath(PathBuf),
}

/// A pair of packages that are in conflict for some reason,
/// e.g. because they both provide one or more of the same files.
#[derive(Eq, Hash, PartialEq)]
struct ConflictingPackagePair(BuildIdent, BuildIdent);

/// A struct with the two variants needed to calculate a build digest for a
/// package as well as build the package.
struct VariantPair<V1, V2> {
Expand Down Expand Up @@ -349,73 +344,12 @@ where
tokio::spawn(async move { Ok(resolved_layers_copy.layers()) })
};

let mut environment_filesystem = spfs::tracking::Manifest::new(
// we expect this to be replaced, but the source build for this package
// seems like one of the most reasonable default owners for the root
// of this manifest until then
spfs::tracking::Entry::empty_dir_with_open_perms_with_data(
let environment_filesystem = resolved_layers
.get_environment_filesystem(
self.recipe.ident().to_build(Build::Source),
),
);

// Warn about possibly unexpected shadowed files in the layer stack.
let mut warning_found = false;
let entries = resolved_layers.iter_entries();
pin!(entries);

while let Some(entry) = entries.next().await {
let (path, entry, resolved_layer) = match entry {
Err(spk_exec::Error::NonSpfsLayerInResolvedLayers) => continue,
Err(err) => return Err(err.into()),
Ok(entry) => entry,
};

let entry = entry.and_user_data(resolved_layer.spec.ident().to_owned());
let Some(previous) = environment_filesystem.get_path(&path).cloned() else {
environment_filesystem.mknod(&path, entry)?;
continue;
};
let entry = environment_filesystem.mknod(&path, entry)?;
if !matches!(entry.kind, EntryKind::Blob(_)) {
continue;
}

// Ignore when the shadowing is from different components
// of the same package.
if entry.user_data == previous.user_data {
continue;
}

// The layer order isn't necessarily meaningful in terms
// of spk package dependency ordering (at the time of
// writing), so phrase this in a way that doesn't suggest
// one layer "owns" the file more than the other.
warning_found = true;
tracing::warn!(
"File {path} found in more than one package: {} and {}",
previous.user_data,
entry.user_data
);

// Track the packages involved for later use
let pkg_a = previous.user_data.clone();
let pkg_b = entry.user_data.clone();
let packages_key = if pkg_a < pkg_b {
ConflictingPackagePair(pkg_a, pkg_b)
} else {
ConflictingPackagePair(pkg_b, pkg_a)
};
let counter = self.conflicting_packages.entry(packages_key).or_default();
counter.insert(path.clone());
}
if warning_found {
tracing::warn!("Conflicting files were detected");
tracing::warn!(" > This can cause undefined runtime behavior");
tracing::warn!(" > It should be addressed by:");
tracing::warn!(" - not using these packages together");
tracing::warn!(" - removing the file from one of them");
tracing::warn!(" - using alternate versions or components");
}
&mut self.conflicting_packages,
)
.await?;

runtime.status.stack.extend(
pull_task
Expand Down
40 changes: 20 additions & 20 deletions crates/spk-cli/cmd-build/src/cmd_build_test/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -502,39 +502,36 @@ async fn test_package_with_circular_dep_does_not_collect_file_removals(tmpdir: t
// build).
let rt = spfs_runtime().await;

// This extra package "empty" helps create the situation that ends up with
// file removals getting collected in a real world example.
build_package!(
tmpdir,
"circ.spk.yaml",
"empty.spk.yaml",
br#"
api: v0/package
pkg: circ/1.0.0
pkg: empty/1.0.0
build:
script:
- echo "1.0.0" > $PREFIX/version.txt
- echo "hello world" > $PREFIX/hello.txt
- mkdir $PREFIX/subdir
"#
);

build_package!(
tmpdir,
"middle.spk.yaml",
"circ.spk.yaml",
br#"
api: v0/package
pkg: middle/1.0.0
pkg: circ/1.0.0
build:
options:
- pkg: circ
script:
- "true"
install:
requirements:
- pkg: circ
fromBuildEnv: true
"#,
- echo "1.0.0" > $PREFIX/version.txt
- mkdir -p $PREFIX/subdir/v1
- echo "hello world" > $PREFIX/subdir/v1/hello.txt
"#
);

// This build deletes a file that is owned by the previous build. It should
// not be collected as part of the new build.
// This build deletes a subdir that is owned by the previous build. It should
// not be collected (as a mask) as part of the new build.
build_package!(
tmpdir,
"circ.spk.yaml",
Expand All @@ -543,10 +540,13 @@ api: v0/package
pkg: circ/2.0.0
build:
options:
- pkg: middle
- pkg: circ
- pkg: empty
script:
- echo "2.0.0" > $PREFIX/version.txt
- rm $PREFIX/hello.txt
- rm -rf $PREFIX/subdir/v1
- mkdir -p $PREFIX/subdir/v2
- echo "hello world" > $PREFIX/subdir/v2/hello.txt
validation:
rules:
- allow: RecursiveBuild
Expand Down Expand Up @@ -586,10 +586,10 @@ build:
.unwrap()
.to_tracking_manifest();

let entry = manifest.get_path("hello.txt");
let entry = manifest.get_path("subdir/v1");
assert!(
entry.is_none(),
"should not capture file deleted in new build"
"should not capture entry deleted in new build"
);
}

Expand Down
13 changes: 11 additions & 2 deletions crates/spk-exec/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -17,12 +17,21 @@ migration-to-components = [
[dependencies]
async-stream = "0.3"
futures = { workspace = true }
miette = { workspace = true }
relative-path = { workspace = true }
serde_json = { workspace = true }
spfs = { workspace = true }
spk-schema = { workspace = true }
spk-storage = { workspace = true }
spk-solve = { workspace = true }
spk-storage = { workspace = true }
thiserror = { workspace = true }
miette = { workspace = true }
tokio = { workspace = true, features = ["rt"] }
tracing = { workspace = true }

[dev-dependencies]
clap = { workspace = true }
rstest = { workspace = true }
spk-cli-common = { workspace = true }
spk-cmd-build = { workspace = true }
spk-solve-macros = { workspace = true }
tempfile = { workspace = true }
3 changes: 3 additions & 0 deletions crates/spk-exec/src/error.rs
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,9 @@ pub enum Error {
Error(#[from] spfs::Error),
#[error(transparent)]
#[diagnostic(forward(0))]
BuildManifest(#[from] spfs::tracking::manifest::MkError),
#[error(transparent)]
#[diagnostic(forward(0))]
SpkStorageError(#[from] spk_storage::Error),
#[error("Error: {0}")]
String(String),
Expand Down
Loading

0 comments on commit c1f0188

Please sign in to comment.