Skip to content

Conversation

@Siryu6
Copy link

@Siryu6 Siryu6 commented Oct 29, 2025

Description

This PR implements a dry_run feature for reimport-scan operations that allows users to preview what changes would occur during a reimport without making any database modifications.

Key Features:

  • Optional dry_run boolean parameter (defaults to false) on reimport-scan API endpoints
  • Returns detailed statistics about findings that would be created, reactivated, closed, or left untouched
  • Respects all existing reimport parameters (close_old_findings, do_not_reactivate, severity filters, etc.)
  • Non-breaking: all existing functionality remains unchanged

Use Case:
This feature is particularly useful for CI/CD pipelines where teams want to validate scan results and preview changes before merging to production environments.

API Response Example:

{
  "test_id": 123,
  "dry_run": true,
  "changes_preview": {
    "would_create": 5,
    "would_reactivate": 2,
    "would_close": 3,
    "would_leave_untouched": 10,
    "total_changes": 10
  },
  "findings_details": {
    "new_findings": [...],
    "reactivated_findings": [...],
    "closed_findings": [...],
    "unchanged_findings": [...]
  }
}

Documentation

API documentation is included in the serializer field descriptions (help_text).

Files Changed

  • dojo/api_v2/serializers.py - API integration for dry_run parameter
  • dojo/importers/default_reimporter.py - Core dry_run_reimport() logic
  • dojo/importers/default_importer.py - Return signature update
  • dojo/importers/options.py - dry_run parameter validation
  • dojo/fixtures/dojo_testdata.json - Fixed missing SLA_Configuration
  • dojo/engagement/views.py - View layer dry_run support
  • dojo/test/views.py - View layer dry_run support
  • unittests/dojo_test_case.py - Test helper methods
  • unittests/test_import_reimport.py - Minor test adjustments
  • unittests/test_import_reimport_dry_run.py - New comprehensive test suite

Labels to add:

  • enhancement
  • feature

Implements optional dry_run parameter for reimport-scan API endpoints
to preview changes without modifying the database. Useful for CI/CD
pipelines to validate scan results before merging to production.

- Add dry_run parameter to reimport-scan API endpoints
- Implement dry_run_reimport() method respecting all reimport options
- Return detailed preview statistics and findings details
- Add comprehensive test suite with 10 test cases
- Fix missing SLA_Configuration in test fixtures
@dryrunsecurity
Copy link

dryrunsecurity bot commented Oct 29, 2025

DryRun Security

🔴 Risk threshold exceeded.

This pull request modifies multiple importer files (dojo/importers/default_reimporter.py, default_importer.py, options.py) with sensitive-path edits flagged by the scanner, and introduces an information disclosure risk where the reimport_scan dry_run serializes and returns detailed finding data (titles, descriptions, file paths, CVEs, etc.) under only Product_Type_Member/Product_Member checks, potentially exposing sensitive vulnerability details to users without finer-grained authorization.

🔴 Configured Codepaths Edit in dojo/importers/options.py
Vulnerability Configured Codepaths Edit
Description Sensitive edits detected for this file. Sensitive file paths and allowed authors can be configured in .dryrunsecurity.yaml.
🔴 Configured Codepaths Edit in dojo/importers/default_reimporter.py
Vulnerability Configured Codepaths Edit
Description Sensitive edits detected for this file. Sensitive file paths and allowed authors can be configured in .dryrunsecurity.yaml.
🔴 Configured Codepaths Edit in dojo/importers/default_reimporter.py
Vulnerability Configured Codepaths Edit
Description Sensitive edits detected for this file. Sensitive file paths and allowed authors can be configured in .dryrunsecurity.yaml.
🔴 Configured Codepaths Edit in dojo/importers/default_reimporter.py
Vulnerability Configured Codepaths Edit
Description Sensitive edits detected for this file. Sensitive file paths and allowed authors can be configured in .dryrunsecurity.yaml.
🔴 Configured Codepaths Edit in dojo/importers/default_importer.py
Vulnerability Configured Codepaths Edit
Description Sensitive edits detected for this file. Sensitive file paths and allowed authors can be configured in .dryrunsecurity.yaml.
Information Disclosure in Dry Run Mode in dojo/importers/default_reimporter.py
Vulnerability Information Disclosure in Dry Run Mode
Description The dry_run mode in the re-importer serializes detailed finding information, including title, severity, description, cwe, cve, cvssv3, component_name, component_version, file_path, and line. This serialized data is returned in the API response. The reimport_scan API endpoint, which handles this functionality, only checks for Product_Type_Member or Product_Member permissions. These permissions are broad and do not granularly restrict access to specific finding details. Therefore, a user with these permissions could potentially trigger a dry run and view sensitive vulnerability details (e.g., detailed descriptions, file paths, CVEs) for findings they might not be authorized to see through the standard UI, or for findings that are otherwise mitigated or suppressed.

class DefaultReImporter(BaseImporter, DefaultReImporterOptions):
"""
The classic reimporter process used by DefectDojo
This importer is intended to be used when mitigation of
vulnerabilities is the ultimate tool for getting a current
point time view of security of a given product
Dry Run Mode:
-------------
When dry_run=True, the importer performs a simulation of the reimport process
without making any database changes. This allows users to preview what would
happen during a real reimport.
The dry_run mode uses in-memory tracking to accurately simulate deduplication,
including matches between findings within the same scan report. This means that
if finding 100 and 101 in the report have the same hash_code, finding 101 will
correctly be identified as a duplicate of finding 100, just as in a real import.
Known Limitations in Dry Run Mode:
- Endpoint updates are not simulated
- Finding groups are not processed
- JIRA integration is skipped
- No notifications are sent
- Test/engagement timestamps are not updated
"""
def __init__(self, *args, **kwargs):

We've notified @mtesauro.


All finding details can be found in the DryRun Security Dashboard.

Copy link
Member

@valentijnscholten valentijnscholten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR. We'll take a look and see if the feature is something that we can add without adding too much maintenance burden. Although we have lifted the feature freeze, it's still advised to double check before starting a PR.

Some initial thoughts:

  • More than half of the changes in the PR seem to be cosmetic. Would it be possible to revert these cosmetic changes so it becomes easier to digest and review?
  • The dry_run now seems to be a code path seperate from the "real" reimport. Have you considered using the rel reimport code path and add some if-statements to make a dry_run mode? Two code paths increase the risk of differences/mismatches.
  • The reimport matching logic can be influenced by finding that are created during reimport. For example if finding 100 and 101 in the report are new and both have the same hash_code, the finding 101 will not be imported as it is matched against finding 100. In dry_run mode this match won't happen, so the statistics might be different. I haven't verified this, but this is what I am thinking.
  • The reimport process might change in the near future where it will perform batch based matching and/or batch processing of the report. I am not sure if it will be possible to maintain a 100% complete dry_run feature.

@Siryu6
Copy link
Author

Siryu6 commented Oct 30, 2025

More than half of the changes in the PR seem to be cosmetic. Would it be possible to revert these cosmetic changes so it becomes easier to digest and review?

Yeah my bad, that's my IDE 😅 I'll revert that.

The dry_run now seems to be a code path seperate from the "real" reimport. Have you considered using the rel reimport code path and add some if-statements to make a dry_run mode? Two code paths increase the risk of differences/mismatches.

I first did that but it was lot of if conditions, didn't like it and decided to separate it, if you prefer this i can switch to that.

The reimport matching logic can be influenced by finding that are created during reimport. For example if finding 100 and 101 in the report are new and both have the same hash_code, the finding 101 will not be imported as it is matched against finding 100. In dry_run mode this match won't happen, so the statistics might be different. I haven't verified this, but this is what I am thinking.

I'll verify its properly implemented and fix it if it's not the case.

The reimport process might change in the near future where it will perform batch based matching and/or batch processing of the report. I am not sure if it will be possible to maintain a 100% complete dry_run feature.

I guess that's a good reason to switch to if statements 😅

These files contained only formatting changes (line breaks, spacing)
that made the diff harder to review. Reverting to focus the PR on
the core dry_run functionality.
This addresses maintainer feedback by:

1. Eliminating duplicate code path: Removed the separate dry_run_reimport()
   method that duplicated most of process_findings() logic. Now dry_run uses
   the same code path as regular reimport with conditional checks.

2. Implementation approach:
   - Added 'if self.dry_run' conditionals at key points to skip DB writes
   - Created categorize_matched_finding_for_dry_run() helper for matching logic
   - Modified process_scan() to handle dry_run within the main flow
   - Updated process_findings() to skip DB operations in dry_run mode
   - Modified close_old_findings() to return early in dry_run mode
   - Updated process_results() to build findings_details for dry_run

3. Benefits:
   - Single code path reduces maintenance burden
   - Changes to reimport logic automatically apply to dry_run
   - Easier to review and understand
   - Reduced code size by 49 lines

4. Known limitations (now documented):
   - Findings with same hash_code in one report won't match each other
     in dry_run (since first isn't saved to DB)
   - This may show slightly more 'new' findings than would actually occur
   - This is an acceptable limitation of dry_run mode

The refactoring maintains the same external API and behavior while
significantly simplifying the implementation.
@github-actions github-actions bot removed the apiv2 label Nov 5, 2025
This addresses the maintainer's concern about finding matching within
the same scan report.

Problem:
If findings 100 and 101 in the same report have identical hash_codes,
in a real import finding 101 would match against finding 100 (which was
just saved to the DB). In the previous dry_run implementation, this match
would not occur since finding 100 was never saved, leading to inaccurate
statistics.

Solution:
1. Track new findings in-memory during dry_run (self.dry_run_new_findings)
2. Updated match_new_finding_to_existing_finding() to check both:
   - Database findings (existing behavior)
   - In-memory findings from current scan (new for dry_run)
3. Split matching logic into helper methods:
   - _get_db_matches(): Query database for matches
   - _get_in_memory_matches(): Check in-memory findings (dry_run only)
4. When a new finding is created in dry_run, add it to the tracking list

Result:
Dry run now accurately simulates deduplication within the same scan report,
providing statistics that match what would actually happen in a real import.
This makes the dry_run feature much more reliable for previewing imports.

Updated documentation to reflect that this limitation has been resolved.
@Siryu6
Copy link
Author

Siryu6 commented Nov 5, 2025

Thank you for the detailed feedback! I've addressed all three concerns:

  1. Cosmetic changes: Reverted in commit c5135d8
  2. Duplicate code path: Refactored to use single path with conditionals in commit 3664582
  3. Finding matching within report: Implemented in-memory deduplication in commit 23a9b49

The dry_run mode now uses the same code path as regular reimport and accurately simulates deduplication, including matches within the same scan report.

@valentijnscholten
Copy link
Member

Still lots of formatting changes. We're going to have to put this on hold until after the reimport changes we've planned for Q1.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants