Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backport for Yangtze: Fix unclosed file objects in TarOutput #124

Conversation

bernhardkaindl
Copy link
Collaborator

@bernhardkaindl bernhardkaindl commented Sep 13, 2024

Backport to the Yangtze release branch:

Fix unclosed file objects when tar files are created, like when /etc/systemd.tar is created and collected(systemd units) and when --output is used to output to tar files.

This pull request fixes an issue in the TarOutput class where file objects were not being closed properly.

The fix involves using context managers to ensure that the file objects are closed after they are read.

This fix reduces memory usage and may improve the performance of the xen-bugtool / xenserver-status-report process in case tarballs instead of ZIP files are used as output format.

With tar output, TarOutput.addRealFile() did not close file objects.
So in tar mode, the file objects of all captures regular files were
not closed.

As the fixed lines use `TarFile.addfile(tarinfo, open("fn"))`, the
opened file object could have never been passed anywhere else.
Thus, it is safe to fix this pattern using a context manager.

Summary:

- The changes to xen-bugtool are 4 safe lines to use `with open`
- All other changes are improvements for the unit tests to cover it.

Description:

This was found by PYTHONDEVMODE=yes to trace unclosed resources.
I set this up in .pre-commit-config.yaml, so while developing
tests, I saw the warning while testing with pre-commit run -v.

Fixes in this PR:

- Use context managers to fix the locations which are easy to fix.
- Extend the unit tests (those in master now) to cover both locations.
- While updating those tests, improve them to check the file list of
  the ZIP and tar archives generated in this particular test case.
- Configure GitHub CI to create source annotations for PYTHONWARNINGS
  which are issued by such cases and are shown in code review of PRs.

Notes:

In the course of a year, my development machine accumulated 24000
files in /var/xapi/blobs (spread evenly over all subdirectories),
so Python apparently does some garbage collection, but it's certainly
not good to never close so many open file-like objects in Python.

Closing them in time has the potential to reduce the memory usage
of the xen-bugtool / xenserver-status-report process.

There are more locations where Python 3.10 reports unclosed file
objects, but it seems we may have to use enable tracemalloc as
recommended by the issued PYTHONWARNINGS to find their source.

Signed-off-by: Bernhard Kaindl <bernhard.kaindl@cloud.com>
@bernhardkaindl bernhardkaindl changed the title Fix unclosed file objects in TarOutput Backport for Yangtze: Fix unclosed file objects in TarOutput Sep 13, 2024
@bernhardkaindl bernhardkaindl merged commit c9b2e02 into xenserver:release/yangtze Sep 25, 2024
@bernhardkaindl bernhardkaindl deleted the tar-output-close-files branch September 25, 2024 16:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants