Skip to content

lsm: Use walk API with noxdev for recursive SELinux relabeling#2045

Draft
jeckersb wants to merge 1 commit intobootc-dev:mainfrom
jeckersb:label-noxdev
Draft

lsm: Use walk API with noxdev for recursive SELinux relabeling#2045
jeckersb wants to merge 1 commit intobootc-dev:mainfrom
jeckersb:label-noxdev

Conversation

@jeckersb
Copy link
Collaborator

@jeckersb jeckersb commented Mar 4, 2026

Rewrite ensure_dir_labeled_recurse to use the cap_std_ext walk API
with noxdev instead of manually recursing through directories. This
prevents the recursive labeling from crossing mount point boundaries,
avoiding failures when pseudo-filesystems like sysfs are mounted
under the target root.

Assisted-by: OpenCode (claude-opus-4-6)
Signed-off-by: John Eckersberg jeckersb@redhat.com

@bootc-bot bootc-bot bot requested a review from gursewak1997 March 4, 2026 22:02
cgwalters
cgwalters previously approved these changes Mar 4, 2026
@cgwalters cgwalters enabled auto-merge (rebase) March 4, 2026 22:05
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors ensure_dir_labeled_recurse to use the cap_std_ext walk API with noxdev, which is a good improvement to prevent crossing mount point boundaries during recursive SELinux relabeling. However, a critical security vulnerability has been introduced: the use of ControlFlow::Break(()) prematurely terminates the SELinux relabeling process, leaving the filesystem in an inconsistently labeled state. Additionally, the logic for counting relabeled items no longer includes the root directory, leading to an inaccurate count, and the handling of skipped items incorrectly terminates the entire walk instead of just skipping the specified entry.

if let Some((skip_dev, skip_ino)) = skip {
if (metadata.dev(), metadata.ino()) == (skip_dev, skip_ino) {
tracing::debug!("Skipping dev={skip_dev} inode={skip_ino}");
return Ok(ControlFlow::Break(()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

The use of ControlFlow::Break(()) here is a significant security regression. It prematurely terminates the entire directory traversal, leaving subsequent files and directories without correct SELinux labels if a skipped entry is found. This is a regression from the previous implementation that would only skip that single entry. You should use ControlFlow::Continue(()) to skip only the current item and continue the walk. Note that if the skipped entry is a directory, walk() will still descend into it, which might be an unintended side effect if skip can be a directory.

Suggested change
return Ok(ControlFlow::Break(()));
return Ok(ControlFlow::Continue(()));

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm really we should do both; Break() for skipped directories and Continue() for skipped non-directories.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait no I think Gemini is wrong here, it probably just didn't load the docs, but see https://docs.rs/cap-std-ext/latest/cap_std_ext/dirext/trait.CapStdExtDirExt.html#tymethod.walk

Break is documented to be used for exactly this purpose.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a problem with the Gemini review, my understand is what happens on the backend is it might clone the git repo, but it doesn't usually proactively try to load content from outside of that, and so it just works off ambient knowledge, and in this particular case I think it was just guessing about semantics of cap-std-ext.

It's for reasons like this that honestly (as good as it is sometimes) I want to try to go to a place where AI reviews are triggered and owned by humans and we've tweaked them to ensure they include relevant context by default and crucially are running in our default devcontainer so they can actually test things and not just guess.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well I think it's mostly correct though. Usually you'd skip a directory so that should be fine to Break, but if you happen to skip a non-directory using Break is going to skip not just the skipped entry but also its siblings which is not what we want.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We know what we passed in is a directory though. How about a check to that effect + comment here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the "big picture" outside of here, yes we only skip directories, but as far as this little bit of API goes we only have the (dev_t, ino64_t) pair so in theory you could skip regular files for some reason. I'm going to push a new version that checks for both and chooses accordingly but if I'm misunderstanding something we can change it again.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a related note I have some code laying around in a branch or stash somewhere that changes the skip from Option<(libc::dev_t, libc::ino64_t)> to &[(libc::dev_t, libc::ino64_t)] since as we untangle all of this I think if we're going to label the entire physical root we're going to need to skip composefs/objects (after labeling them usr_t prior) and also the ostree deployment root(s), so it's going to need to take multiple things to skip.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would we skip composefs/objects? I think the label on them can just be usr_t too.

We do probably do need to skip the ostree deployment roots, but again I think ostree nowadays is already setting labels for those regardless, and THIS code should be doing "only set label if none exists" right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah that's true, I was thinking after we labeled the composefs objects we would need to be careful not to re-label them in the bulk relabel, but yeah this code skips any labeled files, even if they have the "wrong" label per the policy.

return Ok(ControlFlow::Break(()));
} else {
return Ok(ControlFlow::Continue(()));
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is just wrong let me know but that's how I understand it to work from the docs.

@jeckersb
Copy link
Collaborator Author

jeckersb commented Mar 4, 2026

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

The pull request refactors the ensure_dir_labeled_recurse function to utilize the cap_std_ext::dirext::walk API with the noxdev option. This change correctly prevents recursive SELinux relabeling from traversing across mount point boundaries, improving robustness, especially when dealing with pseudo-filesystems like sysfs. However, this introduces a significant bug in the directory skipping logic where using ControlFlow::Break will cause the entire walk to terminate prematurely, potentially leaving many files unlabeled. Additionally, the new implementation follows symlinks at the top level, which is a change in behavior from the previous manual recursion and could lead to unintended relabeling.

Comment on lines +440 to +444
if component.file_type.is_dir() {
return Ok(ControlFlow::Break(()));
} else {
return Ok(ControlFlow::Continue(()));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

In the new implementation of ensure_dir_labeled_recurse, the skip logic for directories returns ControlFlow::Break(()). In the cap_std_ext walk API (and standard Rust ControlFlow usage), Break signals the operation to terminate early. This means that if a directory matches the skip criteria, the entire walk will stop, and all subsequent files and directories will be skipped. This is a regression from the previous manual recursion which correctly continued with siblings. This can lead to a significant portion of the filesystem being left with incorrect SELinux labels if a skipped directory is encountered early in the walk.

To skip a directory and its children while continuing with siblings, the walk API's iterator should be used directly, allowing the use of the skip_current_dir() method. Alternatively, if the walk method with a closure is used, it must be ensured that the return value correctly reflects the intended behavior (though the current cap_std_ext API may not support skipping a directory via the closure's return value without stopping the walk).

Copy link
Collaborator Author

@jeckersb jeckersb Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it's still confused but by my reading of the walk API this is correct now.

.noxdev()
.path_base(path_for_read.as_std_path());

root.open_dir(path_for_read)?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-medium medium

The function now uses root.open_dir(path_for_read)? to begin the directory walk. In cap-std, open_dir follows symlinks by default. The previous implementation used symlink_metadata and manual recursion, which did not follow symlinks to directories. This change means that if the starting path is a symlink to a directory, bootc will now follow it and relabel the contents of the target directory. This could lead to unintended relabeling of files outside the intended scope if the path is a symlink.

Check if the starting path is a symlink before opening it with open_dir, or use an opening method that does not follow symlinks if the intention is to maintain the previous behavior of not recursing into symlinked directories.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also appears to be AI incorrectness

@jeckersb
Copy link
Collaborator Author

jeckersb commented Mar 4, 2026

error: Installing to filesystem: Creating ostree deployment: Recursive SELinux relabeling of boot: Unexpected SELinuxLabelState::Unsupported during walk at boot/efi

🤨

@cgwalters cgwalters marked this pull request as draft March 5, 2026 12:30
@cgwalters cgwalters marked this pull request as draft March 5, 2026 12:30
@jeckersb
Copy link
Collaborator Author

jeckersb commented Mar 5, 2026

error: Installing to filesystem: Creating ostree deployment: Recursive SELinux relabeling of boot: Unexpected SELinuxLabelState::Unsupported during walk at boot/efi

🤨

I think this is because using noxdev with the walk will prevent the traversal into /boot/efi, but we still try to label it as part of iterating through the entries of /boot.

@cgwalters
Copy link
Collaborator

Hmm yes interesting I would offhand say this is a bug in .noxdev() in walk it should also skip the mountpoint dir entirely.
But we could also work around it here by replicating the openat-noxdev logic for dirs here...

@jeckersb
Copy link
Collaborator Author

jeckersb commented Mar 5, 2026

My current plan is... if we get Unsupported, check whether it's a mounpoint, and if so skip it.

One minor problem though is if we encounter a mountpoint for a filesystem that does support labels, we will label the root of that filesystem but not traverse any further. To avoid that we would need to check every directory we encounter to see if it's a mountpoint before labeling, which seems rather expensive.

@jeckersb
Copy link
Collaborator Author

jeckersb commented Mar 5, 2026

As an aside it's mildly annoying that this seems to pass in all of our tests other than Test install since that one is more bespoke to the CI runner and is not easily reproducible locally with just like everything else.

@jeckersb
Copy link
Collaborator Author

jeckersb commented Mar 5, 2026

My current plan is... if we get Unsupported, check whether it's a mounpoint, and if so skip it.

Added this for now just to see if it passes or fails in some other way

@cgwalters
Copy link
Collaborator

As an aside it's mildly annoying that this seems to pass in all of our tests other than Test install since that one is more bespoke to the CI runner and is not easily reproducible locally with just like everything else.

Yeah 🙁 - I think we can try hard cutting those over to a tmt flow too.

That said we can optimize this a bit: the install tests don't actually need a full bcvk libvirt they just need bcvk ephemeral which would be cheaper.

@jeckersb
Copy link
Collaborator Author

jeckersb commented Mar 5, 2026

Ok updated with temporary cargo patch to point at coreos/cap-std-ext#88, so now we call .skip_mountpoints() and remove the logic here that previously checked for mountpoints.

…relabeling

Rewrite ensure_dir_labeled_recurse to use the cap_std_ext walk API
with noxdev and skip_mountpoints instead of manually recursing through
directories. This prevents the recursive labeling from crossing mount
point boundaries, avoiding failures when pseudo-filesystems like sysfs
are mounted under the target root.

Assisted-by: OpenCode (claude-opus-4-6)
Signed-off-by: John Eckersberg <jeckersb@redhat.com>
@jeckersb
Copy link
Collaborator Author

jeckersb commented Mar 5, 2026

Also updated the docstring on ensure_dir_labeled_recurse to mention skip_mountpoints.

@jeckersb
Copy link
Collaborator Author

jeckersb commented Mar 5, 2026

Ok this looks pretty good now to me, the only failure was a network flake:

content: error: Installing to disk: Creating ostree deployment: Preparing import: Fetching manifest: failed to invoke method GetManifest: reading manifest sha256:12b0be0890a35baac8bdec7157918cf2dc6aa7ee28cc64ffd2e0cb682250d789 in quay.io/centos-bootc/centos-bootc: received unexpected HTTP status: 500 Internal Server Error

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants