Run themis on deps #817

Closed
wants to merge 85 commits into from

@spatten commented Feb 22, 2022

Overview

Closes ANE-25 — Use themis to license scan vendored dependencies
Closes ANE-26 — Upload cli-side license data to S3 and trigger the build

Run themis-cli on the list of vendored dependencies in fossa-deps.yml, upload the resulting scan to S3 and then trigger a build on Core.

Things this PR does not do:

There are also no tests. I think that this code is similar to other code-paths that are not tested, but please let me know if there are areas that are reasonable to test.

Acceptance criteria

This PR changes the behavior for vendored dependencies found in fossa-deps.yml.

  • If archiveOrCLI in App.Fossa.ManualDeps is set to ArchiveUpload, then there should be no change: we should run an archive upload on any vendored dependencies
  • If archiveOrCLI is set to CLILicenseScan, then we should run a license scan on any vendored dependencies

Running a license scan means:

  • we run themis on the directory pointed to by each vendored dependency
  • the results from each of the themis scans are gzipped and uploaded to S3
  • Once all of the scans are completed and uploaded, we trigger a CLILicenseScanBuild job for each vendored dependency.
  • archiveOrCLI must be set to ArchiveUpload before this is merged into master
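The branching described above can be sketched as a pure function that lists the steps expected for each setting. This is only an illustration, not the code in this PR: ArchiveUpload and CLILicenseScan come from the description, while ArchiveUploadType, VendoredDependency, Step, and planSteps are hypothetical names.

```haskell
-- Hypothetical sketch: only ArchiveUpload and CLILicenseScan are names from
-- the PR description; every other name here is invented for illustration.
data ArchiveUploadType = ArchiveUpload | CLILicenseScan
  deriving (Eq, Show)

data VendoredDependency = VendoredDependency
  { depName :: String
  , depPath :: FilePath
  , depVersion :: String
  }
  deriving (Eq, Show)

-- | One observable step of the workflow.
data Step
  = UploadArchive String
  | RunThemis FilePath
  | GzipAndUploadScan String
  | TriggerLicenseScanBuild String
  deriving (Eq, Show)

-- | List the steps for a set of vendored dependencies.
-- In CLILicenseScan mode every dependency is scanned and uploaded first,
-- and the build jobs are only triggered after all uploads are done.
planSteps :: ArchiveUploadType -> [VendoredDependency] -> [Step]
planSteps ArchiveUpload deps = map (UploadArchive . depName) deps
planSteps CLILicenseScan deps =
  concatMap scanAndUpload deps ++ map (TriggerLicenseScanBuild . depName) deps
  where
    scanAndUpload dep = [RunThemis (depPath dep), GzipAndUploadScan (depName dep)]
```

With two dependencies in CLILicenseScan mode, the plan contains two scan/upload pairs followed by two build triggers, which matches the "once all of the scans are completed and uploaded" criterion.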

Testing plan

Core setup

Check out the CLI license scan endpoint PR. The branch name is cli-license-scan-endpoint

git checkout cli-license-scan-endpoint

Do whatever you normally do to get core running on your system. You'll need to have minio, the agent and a faktory worker running, so you might also need to do:

docker compose up -d s3 agent faktory-worker

You'll want to watch these logs to make sure that the right job is getting queued:

docker compose logs --follow faktory-worker

Test directory setup

Make a simple yarn project with a single vendored dependency:

mkdir ~/fossa/license-scan-dirs/archive-upload-with-target
cd ~/fossa/license-scan-dirs/archive-upload-with-target
mkdir yarn-package

Put this in fossa-deps.yml:

vendored-dependencies:
  - name: yarn-package
    path: yarn-package
    version: 1.0.1

Put this in yarn-package/package.json:

{
  "name": "yarn-testing",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "",
  "license": "MIT",
  "dependencies": {
    "bson": "4.5.1"
  }
}

Then generate yarn.lock and remove node_modules:

cd yarn-package
yarn install
rm -rf node_modules

Now do the same for a directory with multiple archives:

mkdir ~/fossa/license-scan-dirs/archive-upload-with-multiple-targets
cd ~/fossa/license-scan-dirs/archive-upload-with-multiple-targets
mkdir first-yarn-package
mkdir second-yarn-package

Put this in fossa-deps.yml:

vendored-dependencies:
  - name: first-yarn-package
    path: first-yarn-package
    version: 1.0.2
  - name: second-yarn-package
    path: second-yarn-package
    version: 2.0.1

Put this in first-yarn-package/package.json:

{
  "name": "yarn-testing",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "",
  "license": "MIT",
  "dependencies": {
    "bson": "4.5.1"
  },
  "devDependencies": {
    "@storybook/addon-actions": "6.3.7",
    "@storybook/addon-docs": "6.3.7"
  }
}

Put this in second-yarn-package/package.json:

{
  "name": "yarn-testing-2",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "",
  "license": "Apache-2.0",
  "dependencies": {
    "bson": "4.5.1"
  },
  "devDependencies": {
    "@storybook/addon-actions": "6.3.7",
    "@storybook/addon-docs": "6.3.7"
  }
}

Then generate yarn.lock for each package and remove node_modules:

cd first-yarn-package
yarn install
rm -rf node_modules
cd ../second-yarn-package
yarn install
rm -rf node_modules

Run the CLI

Without any changes, run the CLI:

export FOSSA_API_KEY='valid api key for your dev setup'
cabal run fossa -- analyze -e http://localhost:9578 ~/fossa/license-scan-dirs/archive-upload-with-target

This should successfully go through the previously existing archive upload workflow.

Now edit src/App/Fossa/ManualDeps.hs. On line 92 change ArchiveUpload to CLILicenseScan

  let archiveOrCLI = CLILicenseScan

Edit fossa-deps.yml and change the version of the dependency from 1.0.1 to 1.0.2.

Run the scan again:

cabal run fossa -- analyze -e http://localhost:9578 ~/fossa/license-scan-dirs/archive-upload-with-target

This time it should run a license scan on the yarn-package directory and upload the results to your local minio.

There should be no difference in the UI. The license scan result from a cli-side license scan should be the same as a license scan done on the servers.

However, what happens on the back end is very different.

The archive upload workflow uploads the contents of the vendored dependency to archive_components/archive+<org ID>/<locator>.

So, if your org ID is 1, you'll see a file uploaded to archive_components/archive+1/yarn-package$1.0.1.

Download that file and gunzip it. Check that you see the contents of the yarn-package directory.

The CLI license scan workflow scans the contents of the vendored dependency and uploads the results of that license scan to archive_license_data/archive+<org ID>/<locator>.

So, if your org ID is 1, you'll see a file uploaded to archive_license_data/archive+1/yarn-package$1.0.2.
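To make the two upload locations concrete, here is a small sketch that builds both keys from an org ID, a dependency name, and a version. It assumes the locator is simply name$version, as in the examples above; UploadKind and s3Key are hypothetical names and are not part of the CLI.

```haskell
-- Hypothetical helper for illustration only; the real CLI and Core may
-- build these keys differently.
data UploadKind
  = ArchiveContents   -- archive upload workflow
  | LicenseScanResult -- CLI license scan workflow

-- | Top-level prefix used by each workflow, as described in this PR.
prefix :: UploadKind -> String
prefix ArchiveContents = "archive_components"
prefix LicenseScanResult = "archive_license_data"

-- | Build the key as <prefix>/archive+<org ID>/<name>$<version>.
s3Key :: UploadKind -> Int -> String -> String -> String
s3Key kind orgId name version =
  prefix kind ++ "/archive+" ++ show orgId ++ "/" ++ name ++ "$" ++ version
```

For example, s3Key LicenseScanResult 1 "yarn-package" "1.0.2" evaluates to archive_license_data/archive+1/yarn-package$1.0.2, the path to look for in minio after the second run.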

Download that file and gunzip it. Check that you see a JSON file containing the results of a license scan. It should look like this:

mv ~/Downloads/yarn-package\$1.0.2 ~/Downloads/yarn-package\$1.0.2.gz
gunzip ~/Downloads/yarn-package\$1.0.2.gz
cat ~/Downloads/yarn-package\$1.0.2
{"Name":"yarn-package","Type":"cli-license-scanned","LicenseUnits":[{"Files":["yarn.lock"],"Name":"No_license_found","Type":"LicenseUnit","Dir":"","Info":{"Description":""},"Data":[{"path":"yarn.lock","ThemisVersion":"db4cf6c4b22e29bc4f2a4bd48488b8d3c2a3f5d9","Copyrights":null,"match_data":null,"Copyright":null}]},{"Files":["package.json"],"Name":"mit","Type":"LicenseUnit","Dir":"","Info":{"Description":""},"Data":[{"path":"package.json","ThemisVersion":"db4cf6c4b22e29bc4f2a4bd48488b8d3c2a3f5d9","Copyrights":null,"match_data":[{"index":0,"match_string":"MIT\",","length":4,"location":201}],"Copyright":null}]}]}

The contents of the file are just the JSON output.

In both cases, the UI should show an MIT license found for the yarn-package dependency.

Do the same thing for the project in ~/fossa/license-scan-dirs/archive-upload-with-multiple-targets:

cd ~/fossa/license-scan-dirs/archive-upload-with-multiple-targets

cabal run fossa -- analyze -e http://localhost:9578 ~/fossa/license-scan-dirs/archive-upload-with-multiple-targets

You should see two new files in minio in archive_components/archive+<org ID>.

You should also see two CliLicenseScannedBuild tasks run in your faktory logs.

You should see an MIT license for first-yarn-package and an Apache-2.0 license for second-yarn-package.

Risks

References

Checklist

  • I confirmed tests are not viable
  • If this PR introduced a user-visible change, I added documentation into docs/.
  • I updated Changelog.md if this change is externally facing. If this PR did not mark a release, I added my changes into an # Unreleased section at the top.
  • I updated *schema.json if I have made changes for .fossa.yml, fossa-deps.{json, yaml, yml}. You may also need to update these if you have added/removed new dependency (e.g. pip) or analysis target type (e.g. poetry).
  • I linked this PR to any referenced GitHub issues, if they exist.

@meghfossa left a comment

I still have to go through uploading logic, and testing. Leaving some optional nits, and other comments for now.

@skilly-lily left a comment

One required change, specifically the extractXZippedBinary comment.

Comment on lines 94 to 101
compressedThemisIndex <- extractEmbeddedBinary ThemisIndex
let decompressedThemisIndex =
      BinaryPaths
        { binaryPathContainer = binaryPathContainer compressedThemisIndex
        , binaryFilePath = $(mkRelFile "index.gob")
        }
sendIO $ extractLzma (toPath compressedThemisIndex) (toPath decompressedThemisIndex)
pure $ ThemisBins themisActual $ applyTag @ThemisIndex decompressedThemisIndex

We have the entire binary file in memory, it seems a little weird to dump it out to a file and then unzip it on-disk. I think it's best if we create some extractXZippedBinary function which does similar work to extractEmbeddedBinary, but runs in-memory lzma unzipping on the embedded bytestring and dumps that directly to the file.

Looks like Lzma.decompress is what we're looking for.

(Has (Lift IO) sig m, Has Diagnostics sig m) =>
ApiOpts ->
ArchiveComponents ->
m (Maybe C.ByteString)

Why is the return type a bytestring? Generally, we maintain that the API function should handle all data transformation and parsing from the server before returning that value to the caller.

I see that the old archive upload function did that, let's leave a TODO to fix this later.

@spatten commented Mar 16, 2022

This endpoint just returns a 201 if there's a success, and the only place that we call this discards the return value.

Should the return type be m () or something like that?

Yeah, but let's make it a todo to keep this ticket smaller.

@skilly-lily left a comment

Drive-by, I should probably go through this more thoroughly if you want my full review on it.

@skilly-lily

Superseded by #847, couldn't rebase on latest master due to merge commits in the history.

@skilly-lily skilly-lily deleted the run-themis-on-deps branch March 22, 2022 17:44