-
-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LibGfx+LibPDF: Add support for palettized JPEG2000 files #25741
Conversation
From |
62644cf
to
0ecfd72
Compare
|
0ecfd72
to
1321626
Compare
In all PDFs I've seen so far, palettized JPEG2000 files in a PDF use a PDF /Indexed color space, and the PDF expects to get palette index data as grayscale from the JPEG2000 decoder. So add an option to request that behavior. (This means that a palettized JPEG2000 file might use a different palette when inside a PDF than when outside it!) We still don't support decoding standalone palettized JPEG2000 files, only palettized JPEG2000 files embedded in PDFs.
1321626
to
5b0a43b
Compare
return Error::from_string_literal("JPEG2000ImageDecoderPlugin: Only full opacity channel supported yet"); | ||
has_alpha = true; | ||
} | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The diff doesn't make this very visible, but everything in the else
is everything that was on the LHS, just indented one level. (And with the // If you make this more
comment added.)
The approach is to make a distinction between components and channels (mirroring spec language). If there's no palette, each component directly maps to a channel. (This uses move semantics, so there's next to no data copying happening.) If there's a palette, it uses the component mapping box to map components to channels, likely using the palette for some channels. So a single component of palette indices is converted to N (with N usually 3, sometimes 4) full channels. If a palette is present, this copying has some overhead, but it's conceptually simple. Things that should just work: * Palette indices of any bitness * Palette values with bpp in [1..16] (existing support handles) * Two-component images where one component is a palette index and the other is an alpha channel (don't know how to make those, though) * Palettes for CMYK files (and for other color spaces) Note that this codepath is only used for standalone palettized JPEG2000 flies, not for JPEG2000 files in PDFs. (To create ref-indexed.png, I opened openjpeg-lossless-indexed-u8-rgb-u8.jp2 in Photoshop and saved it as PNG.)
Needed for 0000591.pdf, pages 5, 9, 10, 11.
No behavior change.
No effective behavior change, since the invalid value was multiplied with other values that are zero when this case is hit. Serendipitously covered by indexed-small.jp2 that's added in a later commit in this PR. This makes the UBSan bot happy with that test.
I created indexed-small.jp2 with this script: https://github.com/nico/hack/blob/d4772bae/make_palettized_jpeg2000.py I created jpeg2000-indexed-small.pdf with a local script that creates a PDF file embedding a jpeg2000 file, and then manually changed the indexed color space in the PDF to use 0x7f bytes for all bytes in the palette that were 0xff. Since PDF renderers are supposed to use the palette that's in the PDF file, not the one that's in the JPEG2000 data, this means that the image shows up darker in the PDF than when viewing the file standalone. (It'd also be possible to change the palette colors to something completely different. I just went with "make it slightly darker" since that seems less confusing.)
5b0a43b
to
bc12537
Compare
This adds support for JPEG2000 files with a palette.
In PDFs, these work fairly differently from when they're standalone: In PDFs, the PDF engine loads the palette indices from the jpeg2000 data and then applies a palette that is store in the PDF data. This adds support for both modes.
(PDFs allow using the color space from the jpeg2000 file instead of using an explicit color space. PDF viewers are very inconsistent what they do when doing this with palettized JPEG2000 files. Acrobat refuses to show such images. PDFium and Preview.app then use the palette in the file. I haven't tested yet what happens when using an indexed PDF with a /DeviceGray color space – I'd expect that Acrobat might then show the indices as grayscale data. When setting the color space of the indexed jpeg2000 to /DeviceRGB, Preview.app and Acrobat don't show the image, but PDFium shows the converted data with the palette in the jpeg2000 data (! This seems kind of incorrect; I'll file a bug for this))
Also adds a quirk for a technically invalid construct that happens in one of my test PDFs.