@dustymabe
Member

And also try to normalize the output to make the diffs more meaningful.

See individual commit messages.

This obsoletes #4333 (thanks @jbtrystram)

dustymabe and others added 3 commits November 5, 2025 14:38
In the case of `cosa diff --gc` there will be no active differs so
let's just move the gc code earlier and conditionalize on active_differs
being non-empty for the rest of it.
This approach uses more disk space, but disk access for the diff will
be faster. Files will also survive after git diff returns in
case manual inspection is desired.

Co-authored-by: Jean-Baptiste Trystram <jbtrystram@redhat.com>
The diffs contain a bunch of files with sha256 checksums in them, which
are unique on every build. Let's attempt to normalize the output
directories so the diffs will be less noisy and more meaningful.

Also fix up the permissions on some files because they can't be diffed
otherwise.
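
A minimal sketch of the kind of normalization this commit message describes, assuming the per-build noise is 64-character hex sha256 digests embedded in file and directory names. The `normalize_name` and `normalize_tree` helpers, the regex, and the `SHA256` placeholder are illustrative assumptions, not the code in this PR.

```python
import os
import re

# Assumption: a sha256 digest shows up as 64 lowercase hex characters.
SHA256_RE = re.compile(r'[0-9a-f]{64}')
PLACEHOLDER = 'SHA256'  # hypothetical placeholder, not taken from the PR


def normalize_name(name):
    """Replace every sha256-looking run in a single path component."""
    return SHA256_RE.sub(PLACEHOLDER, name)


def normalize_tree(root):
    """Rename entries under root so per-build digests disappear from paths."""
    # Walk bottom-up so children are renamed before their parent directory.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in dirnames + filenames:
            new_name = normalize_name(name)
            if new_name == name:
                continue
            src = os.path.join(dirpath, name)
            dst = os.path.join(dirpath, new_name)
            if os.path.lexists(dst):
                # Two different digests collapsed to one name; skip this entry.
                continue
            os.rename(src, dst)
```

Running something like this on both extracted trees before diffing would let paths that differ only by digest line up with each other. Skipping on collision keeps the sketch from overwriting anything, at the cost of leaving that one entry un-normalized.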
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request refactors the `cmd-diff` command to extract files from metal images using guestfs and `tar` instead of a FUSE mount. This is a great simplification that removes complexity and potential fragility associated with multiprocessing and FUSE. The introduction of filename normalization for hashes is also a valuable improvement for generating more meaningful diffs. The overall changes are positive, but I have a couple of suggestions to enhance the robustness of the new implementation.

Comment on lines +583 to +588
        # Some files like /etc/shadow and sudo have no read permissions so let's
        # open it up so the difftool can access it.
        runcmd(['find', diff_dir, '-type', 'f', '-perm', '000',
                '-exec', 'chmod', '--verbose', '444', '{}', ';'])
        runcmd(['find', diff_dir, '-type', 'f', '-perm', '111',
                '-exec', 'chmod', '--verbose', '555', '{}', ';'])

medium

These `find` commands to change file permissions are very specific and only handle modes `000` and `111`. Other unreadable file modes (e.g., `200` for write-only) won't be fixed. Additionally, using `-exec ... ';'` is inefficient as it runs `chmod` for every file. A more robust and efficient approach is to use a single `find` command to add read permission for the owner to all files, and use `-exec ... '+'` to process multiple files at once.

        # Some files may not have read permissions. Add read permission for the
        # owner to all files to ensure that the difftool can access them.
        runcmd(['find', diff_dir, '-type', 'f', '-exec', 'chmod', 'u+r', '{}', '+'])
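
For comparison, here is a rough pure-Python equivalent of the `find ... -exec chmod u+r {} +` call, which avoids spawning any subprocess. This is only a sketch: `make_files_readable` is a hypothetical helper that does not exist in `cmd-diff`, and it assumes the directories under the root are already traversable by the current user.

```python
import os
import stat


def make_files_readable(root):
    """Add owner-read permission to every regular file under root.

    Returns the list of paths whose mode was changed.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            # Only regular files, mirroring `find -type f` (skips symlinks).
            if not stat.S_ISREG(st.st_mode):
                continue
            mode = stat.S_IMODE(st.st_mode)
            if not mode & stat.S_IRUSR:
                # Equivalent of `chmod u+r`: keep existing bits, add owner read.
                os.chmod(path, mode | stat.S_IRUSR)
                changed.append(path)
    return changed
```

Because it only adds the owner-read bit, a mode `000` file becomes `400` and a mode `200` file becomes `600`, so all other permission bits still show up unchanged in the diff.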

@jbtrystram
Member

Thank you for picking that up, Dusty! I'll give it a try today, but the change LGTM.
