-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DAS-2161: Updates libs with no user visible changes. #16
Conversation
GDAL adds a NoData value to the png metadata when the colormap has 256 values and only one of those is transparent: RGBA(0,0,0,0). pillow 10.0.0 always returned a 256 length colormap from the quantized image irrespective of the number of colors requested. Pillow 10.x.0 changed this behavior to only return a palete of the unique values in the quantized image. By beginning with a 256-element all-black colortable and writing the truncated one over it, we retain the previous behavior which allows for some images to have a NoData Value in the PNG metadata.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the code changes make sense. I have a couple of questions (that are probably are just easy things you'll tell me aren't issues).
The unit tests ran and passed locally, as did the regression test suite. Nice!
I'm probably in the opposite camp about not releasing. I'd have probably opted to do so, so we can test these changes in isolation as a patch release. I do appreciate that they shouldn't be significant, and that it leads to more deployment wrangling, but we are changing a bunch of underlying code with all those packages updates.
@@ -286,7 +287,7 @@ def get_color_map_from_image(image: Image) -> dict: | |||
|
|||
""" | |||
color_tuples = np.array(image.getpalette(rawmode='RGBA')).reshape(-1, 4) | |||
color_map = {} | |||
color_map = all_black_color_map() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay then. This is me gathering my thoughts (please tell me if anything below is wonky):
all_black_color_map
returns a dictionary of 256 key/value pairs. All have an RGBA tuple that is black.- Below this line you then iterate through all the values in
color_tuples
and assign those values to their appropriate index in the dictionary (but it essentially starts at 0 and fills the first N indices).
So, the questions I have:
- Will you ever get a time when
color_tuples
is significantly less than 256, and you have multiple black values at the end of the colour map? Does that matter? - Will you ever get a time when
colour_tuples
is more than 256? Would that be a problem? (It seems like the code will all still work, but maybenodata
would not be set as expected?)
Possibly nitpicky thought (feel free to ignore), you could use enumerate
below maybe:
for idx, color_tuple in enumerate(color_tuples):
color_map[idx] = tuple(color_tuple)
(And yes, I know I used "color", but only to be consistent 😉)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will you ever get a time when color_tuples is significantly less than 256, and you have multiple black values at the end of the colour map? Does that matter?
Yes, we have gotten that before and will still see that. It does not matter, because there are no values in the data that will have those indexes. For example run gdalinfo
on the ASTGTMV003_N00E022_dem.png from the regression tests.
Will you ever get a time when colour_tuples is more than 256? Would that be a problem? (It seems like the code will all still work, but maybe nodata would not be set as expected?)
No, because we only ask for 254 values when quantizing .
I just tried with 300 for giggles and quantize won't go above 256: ValueError: bad number of colors
Also No Data is set right before reprojection, so it would end up overwriting the last color in the table. Which is also why I only ask for 254 values, so that we have a transparent and no data index available without ruining any of the quantized values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly nitpicky thought (feel free to ignore), you could use
enumerate
below maybe:for idx, color_tuple in enumerate(color_tuples): color_map[idx] = tuple(color_tuple)
Sure that's cleaner.
Oh, I looked at the Snyk issues. They are kind of tricky - looks like the packages need a version of Python that Snyk isn't running. Snyk looks like it is using Python 3.9. The Python version used to be non-configurable in the way we are using Snyk, and I think that might still be the case. It's a total pain and is a reason why some projects get false positives. |
I got totally side tracked and didn't look into it. Thanks for the tip. |
With the Snyk thing, there is a little bit of this being a prior bee in my bonnet. But I looked up the "Python version not supported" issue via the page with the error codes that is linked to from "About this error". That mentioned that it was likely the version of one of the packages needing a different version of Python. To dig out the version of Python being used, when you click on the gear icon, move from the "Settings" tab to the "Overview" one. That then lists Python 3.9. But, to my knowledge, the version of Python is limited to being the same throughout all our Snyk projects. I had raised that the Python version being old was causing false positives with the CI/CD team for EED-3 before, and (if I remember rightly) Snyk got us on to a beta version of Snyk that allowed for us to configure the global version of Python being used (I think the CI/CD team bumped us from 3.7.2 to 3.9), but we still don't have a granular per-project control that ultimately we need. TL;DR - I'm kind of unsure how to proceed here. I don't think we can get around the Snyk issues as they stand. |
I just did this again because I was sure I did this. But this time I saw it. I must have been typing it wrong (again and again). But we can fix with a .synk file? |
@owenlittlejohns following discussions with Jon and Venku. I'm adding this snyk file so they can bump the default version to 3.9. But there's still some wonder as to why snyk is actually failing at 3.9 and 3.11 since 3.7 isn't used anywhere in the project |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd still probably push to release this. But otherwise I think this is looking good. (Or as good as Snyk will let us make it look)
# Snyk (https://snyk.io) policy file, patches or ignores known vulnerabilities. | ||
version: v1.25.0 | ||
language-settings: | ||
python: "3.7" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay - so just vocalising what I think we've got here: the .snyk
file is preserving the version of Python that Snyk was previously using prior to the bump to Python 3.9.
This seems like the way to go for now, but it is a bit iffy that we are almost certainly getting different package versions being scanned by Snyk than we are actually installing in the Docker image. (Side note - it's nice that a .snyk
file is actually working here, I don't think that was always the case)
Description
This ticket was to bump the library versions and discover why regression tests
stopped passing.
There was a change in some image metadata when the pillow (PIL)
library was updated from v10.0.0 to v10.x.0.
The culprit is that GDAL adds a
No Data
value to the PNG metadata only whenthe colormap has 256 values and a single one of those is fully transparent, e.g
RGBA(0,0,0,0).
pillow 10.0.0 always returned a 256 length colormap from the quantized image
irrespective of the number of colors requested.
Pillow 10.x.0 changed this behavior to only return a palette of the unique
values in the quantized image.
The regression tests failed because the shorter colortable passed to gdal did
not meet GDAL's (presumed, not documented) feature of adding a No Data value to
the output metadata.
By beginning with a 256-element all-black color table and writing extracted,
truncated one into it, we retain the previous behavior by having a full length
256 value color table, which allows for some images to have a NoData Value in
the PNG metadata.
Jira Issue ID
DAS-2161
Local Test Steps
Pull branch
Look that the pip_requirements.txt file has updated libraries
Run build and test the image.
Deploy image to Harmony-In-A-Box and run the HyBIG regression tests locally.
They should pass as they did before with no changes to the regression image or
tests.
No release. I think these changes can be picked up in the next release since there are no user facing changes.
PR Acceptance Checklist
CHANGELOG.md
updated to include high level summary of PR changes.docker/service_version.txt
updated if publishing a release.Documentation updated (if needed).