Generated PURL is not providing enough information to derive a valid download URL #149

Closed
DennisClark opened this issue Jul 10, 2024 · 5 comments
Assignees
Labels
bug Something isn't working HighPriority High Priority integration Integration with other applications

Comments

@DennisClark
Member

Here is a valid download URL: https://github.com/apache/nifi/archive/refs/tags/rel/nifi-2.0.0-M3.tar.gz

I successfully created a Package in DejaCode (see attached screenshot) using that download URL and it assigned this PURL to that package: pkg:github/apache/nifi@2.0.0-M3

In a different dataspace, I attempted to create a Package using that PURL, and got this error message:
Error: Could not download content: https://github.com/apache/nifi/archive/2.0.0-M3.tar.gz

So it seems that the PURL does not carry the information required to derive the complete, valid download URL.

[Screenshot: Package created from a download URL]
@DennisClark DennisClark added bug Something isn't working integration Integration with other applications HighPriority High Priority labels Jul 10, 2024
@tdruez
Contributor

tdruez commented Jul 11, 2024

>>> from packageurl.contrib import purl2url, url2purl

>>> download_url = "https://github.com/apache/nifi/archive/refs/tags/rel/nifi-2.0.0-M3.tar.gz"
>>> print(url2purl.get_purl(download_url))
pkg:github/apache/nifi@2.0.0-M3

>>> purl = "pkg:github/apache/nifi@2.0.0-M3"
>>> purl2url.get_download_url(purl)
'https://github.com/apache/nifi/archive/2.0.0-M3.tar.gz'

The generated download URL is invalid: the actual tag is rel/nifi-2.0.0-M3, but the rel/nifi- prefix is lost when the purl is generated.
We cannot reconstruct https://github.com/apache/nifi/archive/refs/tags/rel/nifi-2.0.0-M3.zip from just pkg:github/apache/nifi@2.0.0-M3.

In a way, the generated purl is incorrect, as the version 2.0.0-M3 does not exist on GitHub; see https://github.com/apache/nifi/releases
It only exists there as the rel/nifi-2.0.0-M3 tag: https://github.com/apache/nifi/tags

Maybe the correct purl here should be pkg:github/apache/nifi@rel/nifi-2.0.0-M3

>>> purl2url.get_download_url("pkg:github/apache/nifi@rel/nifi-2.0.0-M3")
'https://github.com/apache/nifi/archive/rel/nifi-2.0.0-M3.tar.gz'   # This URL works!

Alternatively, we may want to store the tag as a qualifier, such as:
pkg:github/apache/nifi@2.0.0-M3?tag=rel/nifi-2.0.0-M3
But this would require additional support in the packageurl library.
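To make the qualifier idea concrete, here is a minimal standard-library sketch. It is not the packageurl API: `github_download_url` is a hypothetical helper, and the `tag` qualifier is the non-standard one proposed above. The `archive/refs/tags/` URL form is taken from the valid download URL reported in this issue.

```python
from urllib.parse import parse_qs


def github_download_url(purl: str) -> str:
    """Hypothetical helper: derive a GitHub archive URL from a pkg:github purl.

    Assumption: a non-standard ``tag`` qualifier, when present, names the git
    tag to download. Without it, the purl version is used as the tag.
    """
    prefix = "pkg:github/"
    if not purl.startswith(prefix):
        raise ValueError(f"not a GitHub purl: {purl!r}")

    remainder = purl[len(prefix):]
    # Drop an optional subpath, then split off the qualifiers.
    remainder, _, _subpath = remainder.partition("#")
    remainder, _, query = remainder.partition("?")
    # The first "@" separates namespace/name from the version.
    path, _, version = remainder.partition("@")
    namespace, _, name = path.partition("/")
    if not (namespace and name and version):
        raise ValueError(f"purl needs a namespace, a name and a version: {purl!r}")

    # parse_qs also decodes percent-encoded values such as rel%2Fnifi-2.0.0-M3.
    qualifiers = parse_qs(query)
    tag = qualifiers.get("tag", [version])[0]
    return f"https://github.com/{namespace}/{name}/archive/refs/tags/{tag}.tar.gz"


print(github_download_url("pkg:github/apache/nifi@2.0.0-M3?tag=rel/nifi-2.0.0-M3"))
```

With this approach the version stays human-friendly (2.0.0-M3) while the qualifier carries what is needed to rebuild the URL. The cost is that every purl producer and consumer would have to agree on the qualifier name.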

@pombredanne
Member

For https://github.com/apache/nifi/releases/tag/rel%2Fnifi-1.27.0, the tag is effectively an unusual rel/nifi-1.27.0, and the same goes for rel/nifi-2.0.0-M3. We should IMHO use this as the version, i.e., use the tag exactly as it is and not try to change it in any way.

>>> from packageurl import PackageURL
>>> p = PackageURL.from_string("pkg:github/apache/nifi@rel/nifi-1.27.0")
>>> p
PackageURL(type='github', namespace='apache', name='nifi', version='rel/nifi-1.27.0', qualifiers={}, subpath=None)
>>> print(p)
pkg:github/apache/nifi@rel/nifi-1.27.0

For the M3 tag, we should get this (this is NOT what happens today):

>>> from packageurl.contrib import purl2url, url2purl

>>> download_url = "https://github.com/apache/nifi/archive/refs/tags/rel/nifi-2.0.0-M3.tar.gz"
>>> print(url2purl.get_purl(download_url))
pkg:github/apache/nifi@rel/nifi-2.0.0-M3

>>> purl = "pkg:github/apache/nifi@2.0.0-M3"
>>> purl2url.get_download_url(purl)
'https://github.com/apache/nifi/archive/2.0.0-M3.tar.gz'  # <-- this URL does not exist,
                                                          # because the 2.0.0-M3 tag does NOT exist

# With the tag used as-is in the purl, the generated URL does exist:
>>> purl = "pkg:github/apache/nifi@rel/nifi-2.0.0-M3"
>>> purl2url.get_download_url(purl)
'https://github.com/apache/nifi/archive/refs/tags/rel/nifi-2.0.0-M3.tar.gz'

So to recap:

  1. url2purl is faulty
  2. we should always take tags as-is, even if they are prefixed with rel/ or v
  3. relating a package repo tag to a package version is an entirely separate concern, handled by something like purl2vcs in purldb
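Points 1 and 2 of the recap could be sketched roughly as follows. This is a standard-library illustration only, not the actual url2purl implementation: `purl_from_github_archive_url` is a hypothetical name, and the sketch only handles the `archive/refs/tags/` URL form seen in this issue.

```python
import re

# Matches https://github.com/<owner>/<repo>/archive/refs/tags/<tag>.<ext>
# The tag may itself contain slashes, as in "rel/nifi-2.0.0-M3".
_ARCHIVE_RE = re.compile(
    r"^https://github\.com/(?P<owner>[^/]+)/(?P<repo>[^/]+)"
    r"/archive/refs/tags/(?P<tag>.+)\.(?:tar\.gz|zip)$"
)


def purl_from_github_archive_url(url: str) -> str:
    """Hypothetical helper: build a pkg:github purl that keeps the tag as-is.

    No prefix such as "rel/nifi-" or "v" is stripped from the tag, so the
    version in the purl is exactly the git tag.
    """
    match = _ARCHIVE_RE.match(url)
    if match is None:
        raise ValueError(f"unsupported GitHub archive URL: {url!r}")
    return "pkg:github/{owner}/{repo}@{tag}".format(**match.groupdict())


print(purl_from_github_archive_url(
    "https://github.com/apache/nifi/archive/refs/tags/rel/nifi-2.0.0-M3.tar.gz"
))
```

Because nothing is stripped from the tag, the purl and the download URL can be derived from each other without loss.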

@tdruez
Contributor

tdruez commented Jul 25, 2024

tdruez added a commit that referenced this issue Jul 25, 2024
Signed-off-by: tdruez <tdruez@nexb.com>
tdruez added a commit that referenced this issue Jul 25, 2024
* Upgrade packageurl-python to latest 0.15.5 version #153

Signed-off-by: tdruez <tdruez@nexb.com>

* Upgrade packageurl-python to latest 0.15.6 version #149

Signed-off-by: tdruez <tdruez@nexb.com>

---------

Signed-off-by: tdruez <tdruez@nexb.com>
@tdruez
Contributor

tdruez commented Jul 25, 2024

Fix merged and deployed.

@tdruez tdruez closed this as completed Jul 25, 2024
@DennisClark
Member Author

Fix confirmed. Thanks @tdruez !
