Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set safe default extraction filter for tar archives #19406

Merged
merged 3 commits into from
Jan 18, 2025

Conversation

nsoranzo
Copy link
Member

PEP 706, first implemented in Python 3.11.4, mitigates some of the security issues of TarFile.extract() and TarFile.extractall() by allowing to specify a filter keyword-only parameter.
Set a safe default (data_filter) for the filter if available, reverting to Python 3.11 behavior ('fully_trusted') otherwise, see https://docs.python.org/3/library/tarfile.html#supporting-older-python-versions

How to test the changes?

(Select all options that apply)

  • I've included appropriate automated tests.
  • This is a refactoring of components with existing test coverage.
  • Instructions for manual testing are as follows:
    1. [add testing steps and prerequisites here if you didn't write automated tests covering all your changes]

License

  • I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

@jdavcs
Copy link
Member

jdavcs commented Jan 16, 2025

@nsoranzo Would it make sense to factor out all the tar_archive.extraction_filter = getattr(tarfile, "data_filter", (lambda member, path: member)) into a helper, and then add a comment to that helper? The PEP suggests to do so ("..This is an unsafe operation, so it should be spelled out explicitly, ideally with a comment")

@nsoranzo nsoranzo force-pushed the TarFile.extraction_filter branch from d67c1a1 to c71f410 Compare January 17, 2025 18:49
@nsoranzo
Copy link
Member Author

@nsoranzo Would it make sense to factor out all the tar_archive.extraction_filter = getattr(tarfile, "data_filter", (lambda member, path: member)) into a helper, and then add a comment to that helper? The PEP suggests to do so ("..This is an unsafe operation, so it should be spelled out explicitly, ideally with a comment")

That was a good idea suggestion, I've extended the existing CompressedFile wrapper for that, and used it everywhere we are extracting from an archive.

Copy link
Member

@jdavcs jdavcs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thank you!

EDIT: test failures seem relevant.

[PEP 706](https://peps.python.org/pep-0706/), first implemented in Python
3.11.4, mitigates some of the security issues of `TarFile.extract()` and
`TarFile.extractall()` by allowing to specify a `filter` keyword-only
parameter.
Set a safe default (`data_filter`) for the filter if available,
reverting to Python 3.11 behavior ('fully_trusted') otherwise, see
https://docs.python.org/3/library/tarfile.html#supporting-older-python-versions

Also:
- Remove unused `tar` parameter from `upload_tar()`
to ensure that the archive is closed.
@nsoranzo nsoranzo force-pushed the TarFile.extraction_filter branch from c71f410 to 54df792 Compare January 18, 2025 15:04
@nsoranzo
Copy link
Member Author

EDIT: test failures seem relevant.

Fixed, thanks for the review!

@nsoranzo nsoranzo merged commit 42bf463 into galaxyproject:dev Jan 18, 2025
52 of 55 checks passed
@nsoranzo nsoranzo deleted the TarFile.extraction_filter branch January 18, 2025 16:48
Copy link

This PR was merged without a "kind/" label, please correct.

@nsoranzo nsoranzo added kind/enhancement kind/refactoring cleanup or refactoring of existing code, no functional changes labels Jan 18, 2025
@nsoranzo nsoranzo mentioned this pull request Mar 11, 2025
7 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/testing area/tool-framework area/util kind/enhancement kind/refactoring cleanup or refactoring of existing code, no functional changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants