Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to get enhanced package data in API #157

Merged
merged 6 commits into from
Aug 9, 2023

Conversation

TG1999
Copy link
Contributor

@TG1999 TG1999 commented Aug 2, 2023

No description provided.

@TG1999 TG1999 changed the title Add option to enhanced package data in API Add option to get enhanced package data in API Aug 2, 2023
Comment on lines 165 to 171
def to_representation(self, instance):
data = super().to_representation(instance)
request = self.context["request"]
if hasattr(request, "query_params") and request.query_params.get("enhanced", False):
from packagedb.api import get_enhanced_package
data["enhanced_package"] = get_enhanced_package(instance)
return data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a hack. Why not use a proper field on the serializer?

Copy link
Contributor

@JonoYang JonoYang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TG1999

Please take a look at the new API endpoint on packages: https://github.com/nexB/purldb/blob/main/packagedb/api.py#L470

We will want to add the function to enhance package data there, rather than overriding to_representation().

Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
@TG1999 TG1999 force-pushed the add_enhanced_package_data_in_api branch from d8155cb to b34edf8 Compare August 7, 2023 14:21
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
@TG1999 TG1999 requested a review from JonoYang August 7, 2023 17:13
Copy link
Contributor

@JonoYang JonoYang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TG1999 I recommended some changes. I noticed that when the enhanced package option is enabled in filter_by_checksums, that packages that are not enhanced are skipped and not returned. I think it may be good for us to have get_enhanced_package to return package data , even if it was not enhanced.

@pombredanne Does this sound like a good idea?

'sha1': 'testsha1-4',
'md5': 'testmd5-3',
'size': 100,
'package_content': 5,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import PackageContentType and use PackageContentType.BINARY instead of 5.

@@ -265,14 +265,31 @@ def setUp(self):
self.package3 = Package.objects.create(**self.package_data3)
self.package3.refresh_from_db()

self.package_data4= {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should add two new maven packages, one of them a binary package, the other a source_archive package. These packages should have the same purl fields (type, namespace, name, version) with the qualifiers being different. On the binary package, leave out license information, and on the source archive package, have license information. We want to see the enhanced package data returned for the binary package in the test results when we enable the option.

packagedb/api.py Outdated
@@ -522,6 +522,12 @@ def filter_by_checksums(self, request, *args, **kwargs):

qs = Package.objects.filter(q)
paginated_qs = self.paginate_queryset(qs)
get_enhanced_package_data = request.query_params.get('get_enhanced_package_data', False)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets use enhance_package_data as the option name.

@pombredanne
Copy link
Contributor

@pombredanne Does this sound like a good idea?

@TG1999 I recommended some changes. I noticed that when the enhanced package option is enabled in filter_by_checksums, that packages that are not enhanced are skipped and not returned. I think it may be good for us to have get_enhanced_package to return package data , even if it was not enhanced.

@pombredanne Does this sound like a good idea?

This is a must. By default we should always return something, enhanced or not. And we should never silently not return things that were requested.

Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
Copy link
Contributor

@JonoYang JonoYang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TG1999 There are some things left to change. Also, I think we are missing the changes to https://github.com/nexB/purldb/blob/main/packagedb/models.py#L1061

packagedb/api.py Outdated
@@ -522,6 +522,17 @@ def filter_by_checksums(self, request, *args, **kwargs):

qs = Package.objects.filter(q)
paginated_qs = self.paginate_queryset(qs)
enhance_package_data = request.query_params.get('enhance_package_data', False)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the enhance_package_data flag should be sent along with the data in the post request, rather than using query_params here

Suggested change
enhance_package_data = request.query_params.get('enhance_package_data', False)
enhance_package_data = data.get('enhance_package_data', False)

expected = self.get_test_loc('api/package-filter_by_checksums-expected.json')
self.check_expected_results(response.data['results'], expected, fields_to_remove=["url", "uuid", "resources"], regen=False)
self.check_expected_results(response.data['results'], expected, fields_to_remove=["url", "uuid", "resources", "package_sets",], regen=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self.check_expected_results(response.data['results'], expected, fields_to_remove=["url", "uuid", "resources", "package_sets",], regen=True)
self.check_expected_results(response.data['results'], expected, fields_to_remove=["url", "uuid", "resources", "package_sets",], regen=False)

enhanced_response = self.client.post('/api/packages/filter_by_checksums/?enhance_package_data=true', data=data)
self.assertEqual(5, len(enhanced_response.data['results']))
expected = self.get_test_loc('api/package-filter_by_checksums-enhanced-package-data-expected.json')
self.check_expected_results(enhanced_response.data['results'], expected, fields_to_remove=["url", "uuid", "resources", "package_sets",], regen=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self.check_expected_results(enhanced_response.data['results'], expected, fields_to_remove=["url", "uuid", "resources", "package_sets",], regen=True)
self.check_expected_results(enhanced_response.data['results'], expected, fields_to_remove=["url", "uuid", "resources", "package_sets",], regen=False)

Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
Copy link
Contributor

@JonoYang JonoYang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TG1999 Thanks for addressing the review comments. I have a last thing for you to do on the PR.

@@ -517,11 +517,24 @@ def filter_by_checksums(self, request, *args, **kwargs):
for field, value in data.items():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to pop enhance_package_data from data before iterating through it, then use enhance_package_data afterwards.

Suggested change
for field, value in data.items():
enhance_package_data = data.pop('enhance_package_data', False)
for field, value in data.items():

packagedb/api.py Outdated
@@ -517,11 +517,24 @@ def filter_by_checksums(self, request, *args, **kwargs):
for field, value in data.items():
# We create this intermediate dictionary so we can modify the field
# name to have __in at the end
if field in ["enhance_package_data"]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if field in ["enhance_package_data"]:

packagedb/api.py Outdated
@@ -517,11 +517,24 @@ def filter_by_checksums(self, request, *args, **kwargs):
for field, value in data.items():
# We create this intermediate dictionary so we can modify the field
# name to have __in at the end
if field in ["enhance_package_data"]:
continue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
continue

packagedb/api.py Outdated
d = {f'{field}__in': value}
q |= Q(**d)

qs = Package.objects.filter(q)
paginated_qs = self.paginate_queryset(qs)
enhance_package_data = data.get("enhance_package_data", False)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this, since we're going to pop enhance_package_data from data above.

Suggested change
enhance_package_data = data.get("enhance_package_data", False)

    * Return unenhanced package data for packages not in a package set or if they do not have package content
    * Refactor code in filter_by_checksums()

Signed-off-by: Jono Yang <jyang@nexb.com>
@JonoYang JonoYang force-pushed the add_enhanced_package_data_in_api branch from ea2d591 to 5cc124c Compare August 9, 2023 20:16
@JonoYang
Copy link
Contributor

JonoYang commented Aug 9, 2023

@TG1999 I've made the changes and committed them to your branch, and I will merge it into main. Thank you again for your PR!

@JonoYang JonoYang merged commit 7a9e4f9 into main Aug 9, 2023
8 checks passed
@JonoYang JonoYang deleted the add_enhanced_package_data_in_api branch August 9, 2023 22:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants