Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migration of trivy fs to trivy repo #4887

Open
1 of 2 tasks
knqyf263 opened this issue Jul 30, 2023 · 5 comments
Open
1 of 2 tasks

Migration of trivy fs to trivy repo #4887

knqyf263 opened this issue Jul 30, 2023 · 5 comments
Assignees

Comments

@knqyf263
Copy link
Collaborator

knqyf263 commented Jul 30, 2023

Description

Currently, Trivy has the following main subcommands:

  • trivy image: Scan local or remote container images
  • trivy vm: Scan local or remote VM images
  • trivy fs: Scan local filesystem
  • trivy repo: Scan remote Git repositories

However, trivy fs command can be somewhat confusing for users. See #3293, #2518, #1543, #1884, #1477, and many examples. The file system scan was originally designed to scan code repositories, and it is intended to be used for scanning repositories locally or in a CI environment. Therefore, unlike container image scanning, it targets lock files such as package-lock.json and does not target artifacts like JAR files, binary files, etc.

In reality, the target for trivy fs is the same as the one for trivy repo. While trivy image and trivy vm support both local and remote targets, trivy fs only supports local repositories and trivy repo only supports remote repositories, which is not aligned with the other commands.

After this change, the subcommands would be as below and it would be more consistent:

  • trivy image: Scan local or remote container images
  • trivy vm: Scan local or remote VM images
  • trivy repo: Scan local or remote Git repositories

Therefore, I propose to deprecate trivy fs and enhance trivy repo to support both local and remote repositories. This will make it more straightforward that it is a subcommand for scanning code repositories.

Implementation

When the argument passed to trivy repo is HTTP/HTTPS protocol, it will scan remotely. If not, it will assume it is a local file path and perform the scan. This change will improve the user experience and make the command usage more intuitive.

Note

Considering the large number of users currently using trivy fs, we will not remove this command immediately. Instead, we will encourage users to transition to using trivy repo and provide a grace period of a few years before trivy fs is fully deprecated. This will give users enough time to adjust their workflows accordingly.

Plan

  1. Add support for local repositories in trivy repo, which is equivalent to trivy fs. (v0.44.0)
  2. Deprecate trivy fs and announce it (v0.45.0)
    • Remove trivy fs from the documentation
    • But the subcommand is still available
  3. Remove trivy fs at some point (TBD)

Tasks

Preview Give feedback
  1. knqyf263
  2. lifecycle/stale
@knqyf263 knqyf263 self-assigned this Jul 30, 2023
@knqyf263 knqyf263 changed the title Enhance trivy repo for local and remote code repository scanning Migration of trivy fs to trivy repo Jul 30, 2023
@tbutler-qontigo
Copy link

Hi

Whilst I applaude a drive for consistency in any product, I'm not sure your plan will reduce confusion particularly, which seems to be one of the drivers for this task.

I'm not sure it is intuitive to use a "repo" scan to scan a local file system - maybe I don't even have a repo, I'm just playing around on my local file system - but assuming this is the right way to go then, looking at the issues you reference where confusion has arisen, it doesn't seem like simply adding a local option to trivy repo will help these cases at all TBH:

None of these issues will be solved by removing trivy fs and adding a local repo option to trivy repo - that will just lead to users using trivy repo in local mode instead and still be told to use trivy rootfs instead.

Should the story actually be to add .jarand .war scanning to more commands - which could include trivy fs or trivy repo if you remove trivy fs?

Equally, why not allow trivy image (and trivy rootfs?) scan things they are missing - like .lock files - I may well be building container images which contain lock files and it would be desirable to scan them.

Or indeed maybe the story should be about removing the confusion between trivy fs and trivy rootfs - perhaps these are the ones that should be combined since they are both about scanning file systems.
Couldn't for example trivy fs / indicate this is a rootfs type of scan and do whatever extra is required on top of a trivy fs scan to make it a "rootfs" scan - though fundamentally they should both support the same file types - gemspec, .lock files etc.

Any way just some thoughts to take or leave - I really appreciate the awesome work you guys do on this product.

thanks

@knqyf263
Copy link
Collaborator Author

knqyf263 commented Jul 31, 2023

Thanks for your input.

Should the story actually be to add .jarand .war scanning to more commands - which could include trivy fs or trivy repo if you remove trivy fs?

There are two main types of scan targets in Trivy:

  • Pre-build
  • Post-build

In the case of container images, it's post-build, and files required before the build (like pom.xml) are likely not included. Therefore, it scans artifacts like JAR files. On the other hand, in the case of Git repositories, post-build artifacts are likely not included, so it needs to scan pom.xml instead of JAR files.

Now, if we were to scan both, there are several problems:

  1. Duplication of results
    For example, it's a bit difficult to identify which JAR file was generated from which pom.xml, so the two results would overlap. This applies not only to JARs but also to Go and Rust binaries.

  2. Slower scan execution
    Scanning JAR files and the like takes time, so scanning both pom.xml and JARs reduces the efficiency of the scan. Even though it would have been sufficient to scan the pom.xml, it ends up scanning the JAR.

For these reasons, we divide the scan targets depending on the target.

  • image: post-build
  • vm: post-build
  • repo: pre-build
  • rootfs: post-build

In other words, if it becomes clearer why artifacts are not scanned because of the name "repo", we believe confusion should decrease.

Of course, as discussed in this issue, adding --enable-analyzers and --disable-analyzers would make it more flexible, but the default behavior should be to scan either pre-build files or post-build artifacts.

For "fs", we could make it a special command that scans both, but changing the behavior of existing subcommands has a significant impact and is not easy.

@tbutler-qontigo
Copy link

Hi

re: pre and post build - I'm not sure this is particularly obvious, not in the documentation anyway, of the split between same - I can see your thinking from the above. Maybe the documentation needs "a bit of a refactor" to surface this way of thinking about the scan types?
Whilst one could, with a bit of thinking, probably arrive at an approximation of the breakdown you have given, rootfs is somewhat troublesome since the docs say it is to "Scan a root filesystem (such as a host machine, a virtual machine image, or an unpacked container image filesystem)." - there is nothing to suggest that scanning a host system is a "post-build" operation though (with some prompting) you could perhaps see that the other two are "post-build".

In other words, if it becomes clearer why artifacts are not scanned because of the name "repo", we believe confusion should decrease.

Perhaps - but only if it is already clear about the pre/post-build divide.
What is less clear however is that I should use a "repo" scan to scan my local file system.

Also, you talk about what may or may not be "likely" - Does it make sense to second-guess the use-cases people may have for containers (and for trivy)? You may be unintentionally limiting how it can be used.
Maybe I need to test that the container I build is secure as a container but also the content I am putting inside the container is secure/well written (which could be code/config to enable me to build something).

The point being that more flexibility in how trivy can be used (vs how you envision it being used) opens up more possibilities for the end user.

Perhaps, as you allude to, it would make sense to add options to allow users to configure exactly which scans they want to run - so if you want both pre and post-build analyses in one trivy run then you have this ability.
image, vm, repo, rootfs then become simply pre-defined collections of scan targets but we would also have the ability to "mix and match" any/all of them however we like?
This could be achieved by:

  • removing the requirement for a target - not specifying an explicit image, repo etc would then require the additional include or exclude options (prefer the former personally)
  • A new target (omni? :) ) would be added which takes the include/exclude options, allowing you to mix and match any and every scan target
  • The existing scan targets get additional options to include what they are currently missing (and potentialy to exclude some things they currently include)

That last option essentially means any of the targets could be used to execute all of the scans (with appropriate options) but the default behaviour is what they do today

cheers

@okainov
Copy link

okainov commented Aug 11, 2023

Please note documentation needs to be updated as well, currently in 0.44.0 I see in CLI help messages:

Scanning Commands
aws [EXPERIMENTAL] Scan AWS account
config Scan config files for misconfigurations
filesystem Scan local filesystem
image Scan a container image
kubernetes [EXPERIMENTAL] Scan kubernetes cluster
repository Scan a remote repository

repository should mention "local filesystem" as well

@TicianoH
Copy link

still confusing but luckily I got here, and will use trivy repo instead of fs now...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: No status
Development

No branches or pull requests

4 participants