Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: Include Private Storage in Orphan File Scanning for filer_check Command #1518

Open
wants to merge 13 commits into
base: master
Choose a base branch
from

Conversation

Baraff24
Copy link
Contributor

@Baraff24 Baraff24 commented Mar 10, 2025

This pull request addresses an issue where the filer_check management command was only scanning the public media directory for orphaned files, ignoring the private storage (used for secure downloads). With this fix, the command now iterates through all storages configured in FILER_STORAGES—including private storage—and correctly detects orphaned files.

Changes include:

  • Updating the storage scanning logic to iterate over all defined storages.
  • Using a label prefix (e.g. "public" or "private") to ensure the output format matches the expected results.
  • Adjusting the verbosity handling for missing file references to comply with test expectations.
  • Adding tests to verify that orphaned files in private storage are detected and removed correctly.

Related resources

Checklist

  • I have opened this pull request against master
  • I have added or modified tests when changing logic
  • I have followed the conventional commits guidelines to add meaningful information into the changelog
  • I have read the contribution guidelines and I have joined #workgroup-pr-review on Slack to find a “pr review buddy” who is going to review my pull request.

Summary by Sourcery

Extends the filer_check command to scan all configured storages for orphaned files, including private storage, and adds tests to verify the detection and removal of orphaned files in private storage.

Bug Fixes:

  • Fixes an issue where the filer_check command only scanned the public media directory for orphaned files, ignoring private storage.
  • Fixes verbosity handling for missing file references to comply with test expectations.
  • Fixes a bug where the command was not correctly detecting orphaned files in private storage due to not iterating over all storages configured in FILER_STORAGES.
  • Fixes a bug where the output format did not match the expected results due to missing label prefixes (e.g. "public" or "private").

Tests:

  • Adds tests to verify that orphaned files in private storage are detected and removed correctly.

Copy link
Contributor

sourcery-ai bot commented Mar 10, 2025

Reviewer's Guide by Sourcery

This pull request modifies the filer_check management command to include private storage when scanning for orphaned files. It iterates through all storages defined in FILER_STORAGES, using a label prefix for the output. Tests have been added to verify the correct detection and removal of orphaned files in both public and private storage.

Sequence diagram for filer_check command with orphaned files in private storage

sequenceDiagram
    participant User
    participant Command
    participant Storage
    participant Database

    User->>Command: Run filer_check --orphans
    activate Command
    Command->>Command: Iterate through FILER_STORAGES (public, private, ...)
    Command->>Storage: listdir(UPLOAD_TO_PREFIX) for each storage
    activate Storage
    Storage-->>Command: Returns list of files
    deactivate Storage
    loop For each file in storage
        Command->>Database: Check if file exists in File model
        activate Database
        Database-->>Command: Returns True/False
        deactivate Database
        alt File does not exist in database (orphan)
            Command->>Command: Identify as orphan
            Command->>User: Output orphaned file path
        end
    end
    deactivate Command
Loading

Sequence diagram for filer_check command with delete orphaned files in private storage

sequenceDiagram
    participant User
    participant Command
    participant Storage
    participant Database

    User->>Command: Run filer_check --orphans --delete-orphans
    activate Command
    Command->>Command: Iterate through FILER_STORAGES (public, private, ...)
    Command->>Storage: listdir(UPLOAD_TO_PREFIX) for each storage
    activate Storage
    Storage-->>Command: Returns list of files
    deactivate Storage
    loop For each file in storage
        Command->>Database: Check if file exists in File model
        activate Database
        Database-->>Command: Returns True/False
        deactivate Database
        alt File does not exist in database (orphan)
            Command->>Storage: delete(file)
            activate Storage
            Storage-->>Command: File deleted
            deactivate Storage
            Command->>User: Output deleted file path
        end
    end
    deactivate Command
Loading

File-Level Changes

Change Details Files
The filer_check command now iterates through all storages defined in FILER_STORAGES to detect orphaned files, including private storage.
  • Iterate through all storages configured in FILER_STORAGES.
  • Use a label prefix (e.g. "public" or "private") for output.
  • Adjust verbosity handling for missing file references.
filer/management/commands/filer_check.py
Added tests to verify that orphaned files in private storage are detected and removed correctly.
  • Added test_delete_orphans_private to verify private storage orphan detection and removal.
  • Added test_delete_orphans_public to verify public storage orphan detection and removal.
  • Refactored existing tests to improve clarity and avoid interference.
tests/test_filer_check.py

Possibly linked issues


Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!
  • Generate a plan of action for an issue: Comment @sourcery-ai plan on
    an issue to generate a plan of action for it.

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @Baraff24 - I've reviewed your changes - here's some feedback:

Overall Comments:

  • Consider extracting the storage walking logic into a separate function or class to improve readability and maintainability.
  • The verbosity handling could be more consistent across the different checks (orphans, missing files, image dimensions).
Here's what I looked at during the review
  • 🟢 General issues: all looks good
  • 🟢 Security: all looks good
  • 🟡 Testing: 2 issues found
  • 🟢 Complexity: all looks good
  • 🟢 Documentation: all looks good

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@vinitkumar
Copy link
Member

@Baraff24 Thank for your contribution. Please have a look at the failing tests and fix those. Also, have a look at the suggestion by sourcery once, They do give useful and pragmatic suggestions sometimes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants