Scanner: Generate PDF security reports #2

@mgoldsborough

Problem

The scanner only generates JSON reports (report.json) and uploads them to S3. The database schema has a pdf_s3_uri column and the callback payload accepts pdf_s3_uri, but no PDF is ever generated or uploaded.

Users on mpak.dev cannot download a PDF security report for a package.

Current State

  • Scanner (job.py) generates scan-results/{scan_id}/report.json on S3
  • security_scans.pdf_s3_uri column exists but is always NULL
  • Callback schema accepts pdf_s3_uri but scanner never sends it
  • Web UI renders scan results inline from the JSON report and has no PDF download link

Desired Behavior

  1. Scanner generates a PDF after scanning completes, containing:
    • Package name, version, scan date
    • MTF certification level and badge
    • Risk score summary
    • Controls passed/failed with details
    • Vulnerability findings by severity
    • SBOM/dependency list
    • Secrets and malicious code findings (if any)
  2. Scanner uploads PDF to S3 at scan-results/{scan_id}/report.pdf
  3. Scanner includes pdf_s3_uri in the callback payload
  4. Web UI adds a "Download PDF" button on the package security section
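
To make the report layout in step 1 concrete, here is a minimal, library-agnostic sketch of how the sections might be assembled before being handed to a PDF renderer. Everything in it is an assumption for illustration: `ReportData` and `Finding` are hypothetical stand-ins for fields of the real ScanReport model, the severity names are guesses, and only a few of the listed sections are shown (controls, SBOM, and secrets would follow the same pattern). The S3 key helper follows the path given in step 2.

```python
from collections import defaultdict
from dataclasses import dataclass, field


# Hypothetical stand-ins for the real ScanReport model (names are assumptions).
@dataclass
class Finding:
    severity: str  # assumed values: "critical", "high", "medium", "low"
    title: str


@dataclass
class ReportData:
    package: str
    version: str
    scan_date: str
    certification_level: str
    risk_score: float
    findings: list[Finding] = field(default_factory=list)


# Assumed ordering, most severe first.
SEVERITY_ORDER = ["critical", "high", "medium", "low"]


def report_s3_key(scan_id: str, ext: str) -> str:
    """Key layout from the issue: scan-results/{scan_id}/report.<ext>."""
    return f"scan-results/{scan_id}/report.{ext}"


def build_sections(data: ReportData) -> list[tuple[str, list[str]]]:
    """Return (heading, lines) pairs in the order the PDF would present them."""
    by_severity = defaultdict(list)
    for finding in data.findings:
        by_severity[finding.severity.lower()].append(finding.title)

    # One summary line per severity that has at least one finding.
    vuln_lines = [
        f"{sev.upper()} ({len(by_severity[sev])}): " + ", ".join(by_severity[sev])
        for sev in SEVERITY_ORDER
        if by_severity[sev]
    ]

    return [
        ("Package", [f"{data.package} {data.version}", f"Scanned: {data.scan_date}"]),
        ("Certification", [f"MTF level: {data.certification_level}"]),
        ("Risk score", [str(data.risk_score)]),
        ("Vulnerabilities", vuln_lines or ["No findings"]),
    ]
```

Keeping section assembly separate from rendering would let whichever PDF library is chosen stay a thin layer on top, and makes the content testable without producing a PDF.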

Implementation Notes

  • Python PDF library options: reportlab, weasyprint, fpdf2
  • Keep it simple: the PDF should be a clean, professional report, not a pixel-perfect replica of the web UI
  • The scanner already has all the data needed (the ScanReport model)
  • PDF generation must not block the callback; if PDF generation or upload fails, the scanner should still send the JSON results
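
The last note (a PDF failure must not stop the callback) could be sketched roughly as follows. This is an illustration under assumptions, not the scanner's actual code: `build_callback_payload`, the `render_pdf` and `upload` callables, the `bucket` argument, and the `scan_id` and `report_s3_uri` payload keys are all hypothetical. Only `pdf_s3_uri` and the S3 key layout come from the issue.

```python
import logging
from typing import Callable

logger = logging.getLogger("scanner.pdf")


def build_callback_payload(
    scan_id: str,
    bucket: str,
    render_pdf: Callable[[], bytes],
    upload: Callable[[str, bytes], None],
) -> dict:
    """Build the callback payload; the PDF step is best-effort."""
    payload = {
        "scan_id": scan_id,  # hypothetical key
        # hypothetical key for the JSON report that is already uploaded
        "report_s3_uri": f"s3://{bucket}/scan-results/{scan_id}/report.json",
        "pdf_s3_uri": None,  # stays None if rendering or upload fails
    }
    try:
        key = f"scan-results/{scan_id}/report.pdf"
        upload(key, render_pdf())
        payload["pdf_s3_uri"] = f"s3://{bucket}/{key}"
    except Exception:
        # Log and continue: the JSON results are still sent.
        logger.exception("PDF step failed for scan %s; sending JSON only", scan_id)
    return payload
```

In this sketch `render_pdf` would wrap whichever library is picked (reportlab, weasyprint, or fpdf2) and `upload` would wrap the scanner's existing S3 client. Injecting both keeps the fallback behavior testable without network access or a PDF dependency.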
