Inspects the packages of AWS Glue jobs and generates an SBOM
- Downloads the list of provided packages from the AWS documentation and converts it into a requirements.txt
- Inspects the parameters of the Glue job to get the job type (pythonshell/glueetl), Glue version, and Python version
- Inspects whether the job uses pythonshell libraries
- Inspects the extra packages configured
- Merges all the requirements into a single file
- Exports the result as an SBOM
First install uv by following the instructions here: https://docs.astral.sh/uv/getting-started/installation/. uv is a fast Python package and project manager. It automatically runs commands in a virtualenv.
Test whether you can download the configuration from AWS:
- Glue ETL https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-python-libraries.html
- Python Shell https://docs.aws.amazon.com/glue/latest/dg/add-job-python.html
uv run glue-inspector download
This converts the website information into requirements.txt files and stores them in ~/.glue_inspector
Set your AWS credentials in the environment (AWS_PROFILE or AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY) and select the correct region.
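A quick pre-flight check of the environment might look like the sketch below. It is only a heuristic and not part of the tool: `credential_hints` is a hypothetical helper, and the region variables `AWS_REGION` and `AWS_DEFAULT_REGION` are an assumption about how the region is supplied (a profile can also carry its own region, so the region check is skipped when `AWS_PROFILE` is set).

```python
import os


def credential_hints(env=None) -> list:
    """Return human-readable hints for AWS settings that appear to be missing."""
    env = os.environ if env is None else env
    hints = []
    has_profile = bool(env.get("AWS_PROFILE"))
    has_keys = bool(env.get("AWS_ACCESS_KEY_ID")) and bool(
        env.get("AWS_SECRET_ACCESS_KEY")
    )
    if not (has_profile or has_keys):
        hints.append("set AWS_PROFILE or AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY")
    # Assumption: without a profile, the region must come from the environment.
    has_region = bool(env.get("AWS_REGION") or env.get("AWS_DEFAULT_REGION"))
    if not has_profile and not has_region:
        hints.append("set AWS_DEFAULT_REGION to the region of your Glue job")
    return hints


for hint in credential_hints():
    print("hint:", hint)
```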
Run this tool:
uv run glue-inspector inspect mygluejob --output mygluejob-sbom.json
You now have an SBOM in CycloneDX format containing the packages from your Glue job. It does not contain vulnerabilities yet; you can use Trivy or any other tool to scan this file.
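Because a CycloneDX SBOM is plain JSON, it is easy to check which packages ended up in it before scanning. The following is a minimal sketch that assumes the standard top-level `components` array with `name` and `version` fields; `list_components` is a hypothetical helper, not part of glue-inspector.

```python
import json
from pathlib import Path


def load_sbom(path: str) -> dict:
    """Load a CycloneDX JSON file from disk."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def list_components(sbom: dict) -> list:
    """Return sorted (name, version) pairs from the SBOM's components."""
    return sorted(
        (c.get("name", ""), c.get("version", ""))
        for c in sbom.get("components", [])
    )


# Example with an in-memory document; for a real file use:
#   list_components(load_sbom("mygluejob-sbom.json"))
sample = {
    "bomFormat": "CycloneDX",
    "components": [
        {"type": "library", "name": "requests", "version": "2.32.0"},
        {"type": "library", "name": "boto3", "version": "1.34.0"},
    ],
}
for name, version in list_components(sample):
    print(f"{name}=={version}")
```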
Check the output file with Trivy. Trivy is an open source tool from Aqua and can be found here: https://trivy.dev/latest/getting-started/
trivy sbom mygluejob-sbom.json --scanners vuln,license --list-all-pkgs -d --format table
Or create a new SBOM with the vulnerabilities included:
trivy sbom mygluejob-sbom.json --scanners vuln,license --list-all-pkgs -d --format cyclonedx --output mygluejob-sbom-trivy.json
I like to write blog posts about this, so I have included a tool that combines the vulnerabilities into a Markdown table:
uv run src/glue_inspector/report/generate-table.py > output.md
The output is also included in the output directory:
filename | critical | high | medium | low | information |
---|---|---|---|---|---|
glueetl-2.0 | 5 | 12 | 12 | 1 | 0 |
glueetl-3.0 | 4 | 16 | 20 | 2 | 0 |
glueetl-4.0 | 4 | 14 | 18 | 2 | 0 |
glueetl-5.0 | 0 | 6 | 11 | 3 | 0 |
pythonshell-3.6 | 1 | 1 | 6 | 0 | 0 |
pythonshell-3.9 | 0 | 0 | 0 | 0 | 0 |
pythonshell-3.9-analytics | 1 | 1 | 3 | 0 | 0 |
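A row of the table above can be derived from a CycloneDX SBOM that includes vulnerabilities. The sketch below is one possible approach and not necessarily how generate-table.py works: it assumes the standard CycloneDX `vulnerabilities[].ratings[].severity` fields, counts each vulnerability once under its highest rating, and skips vulnerabilities without a recognized rating. `count_severities` and `table_row` are hypothetical names.

```python
from collections import Counter

# Ordered from most to least severe; "info" maps to the "information" column.
SEVERITIES = ["critical", "high", "medium", "low", "info"]


def count_severities(sbom: dict) -> Counter:
    """Count vulnerabilities per severity, using the highest rating of each."""
    counts = Counter()
    for vuln in sbom.get("vulnerabilities", []):
        rated = [r.get("severity") for r in vuln.get("ratings", [])]
        rated = [s for s in rated if s in SEVERITIES]
        if rated:
            # The lowest index in SEVERITIES is the most severe rating.
            counts[min(rated, key=SEVERITIES.index)] += 1
    return counts


def table_row(label: str, counts: Counter) -> str:
    """Format one Markdown row in the same column order as the table."""
    cells = [str(counts[s]) for s in SEVERITIES]
    return f"{label} | " + " | ".join(cells) + " |"


sample = {
    "vulnerabilities": [
        {"id": "EXAMPLE-1", "ratings": [{"severity": "medium"}, {"severity": "high"}]},
        {"id": "EXAMPLE-2", "ratings": [{"severity": "critical"}]},
        {"id": "EXAMPLE-3", "ratings": []},
    ]
}
print(table_row("example-job", count_severities(sample)))
```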
- moved from poetry to uv
- fixed pythonshell support
- added glueetl 5
- converted the table generator into a better program
- look up licences of the packages on PyPI
- publish as a PyPI package
- support packages from internal repos
- support manual packages from s3
- ???