CCITT group 4 (Fax4) decoding support #229
Changes from 3 commits
efb23b6
ec5a2e6
e3900f1
f475d6d
fbe26e9
de076fa
2cfe50f
d6ab46b
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -7,6 +7,7 @@ use crate::tags::{ | |
CompressionMethod, PhotometricInterpretation, PlanarConfiguration, Predictor, SampleFormat, Tag, | ||
}; | ||
use crate::{ColorType, TiffError, TiffFormatError, TiffResult, TiffUnsupportedError, UsageError}; | ||
use fax; | ||
use std::io::{self, Cursor, Read, Seek}; | ||
use std::sync::Arc; | ||
|
||
|
@@ -368,6 +369,7 @@ impl Image { | |
} | ||
|
||
fn create_reader<'r, R: 'r + Read>( | ||
&self, | ||
reader: R, | ||
photometric_interpretation: PhotometricInterpretation, | ||
compression_method: CompressionMethod, | ||
|
@@ -447,6 +449,25 @@ impl Image { | |
|
||
Box::new(Cursor::new(data)) | ||
} | ||
CompressionMethod::Fax4 => { | ||
let width = u16::try_from(self.width)?; | ||
let height = u16::try_from(self.height)?; | ||
Should these be the chunk dimensions rather than the image dimensions?
||
let mut out: Vec<u8> = Vec::with_capacity(usize::from(width) * usize::from(height)); | ||
I have two concerns here:
Hmm. Going to push back a bit on (2), but you're the library maintainer so LMK if I'm looking at this wrong. I expect that this crate reports back what the tiff file says in its header, which is
Does that make sense? The alternatives would be to alter the following to special-case this and override what's coming from the header: https://github.com/image-rs/image-tiff/blob/master/src/decoder/image.rs#L351 Alternatively we could add an extra data type to represent the actual output. I'm OK with any of these options, just want to lay them out. What would you like as both the author/maintainer of this library and the primary consumer of its API in the
This crate should already decode To support the image crate use case, I'd expect that this crate should either expose an optional flag to expand sub-8-bit channels (like PNG does), or else not perform any expansion and have code in the image crate to do the conversion from packed 1-bit-per-sample data into L8.
I'm working on getting packed 1-bit samples working here and, TBH, it's quite a bit of fiddly work. I've got the decoder portion outputting 1-bit samples, but the rest of the crate seems to assume the data is byte-addressable. I am looking at the chunking (de-chunking?) code in To step back a bit: especially if the
OK, I've looked a bit further and There are three different code paths:
All of these assume individually addressable pixels. What kind of changes to these abstractions do you propose we make to get this working?
I realized the reason I thought there was sub-byte sample support was that there's a stalled-out PR for it that I lost track of. I'll hopefully have a bit more time this weekend, but briefly:
A final note is to watch out for integer overflow when using
OK. I'll be away all next week but I'd like to pick this up again after that. Thanks for the context here.
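To make the two options in this thread concrete (keep packed 1-bit samples, or expand them to one byte per pixel), here is a minimal sketch of the expansion step with overflow-checked size arithmetic. It is not code from this PR or from the crate: the function name `expand_1bit_rows` and the `set_value` parameter are hypothetical, and it assumes MSB-first bit order with each row padded to a whole byte.

```rust
/// Expand packed 1-bit samples into one byte per pixel.
///
/// Assumptions (not taken from the PR): bits are stored MSB first and every
/// row starts on a byte boundary. `set_value` is the byte written for a 1 bit;
/// a 0 bit is written as 0. Returns `None` on overflow or short input.
fn expand_1bit_rows(
    packed: &[u8],
    width: usize,
    height: usize,
    set_value: u8,
) -> Option<Vec<u8>> {
    // Bytes per packed row: ceil(width / 8), computed without overflow.
    let row_bytes = width.checked_add(7)? / 8;
    // Checked multiplications instead of a bare `width * height`.
    let packed_len = row_bytes.checked_mul(height)?;
    let out_len = width.checked_mul(height)?;
    if packed.len() < packed_len {
        return None;
    }

    let mut out = Vec::with_capacity(out_len);
    if row_bytes == 0 {
        // Zero-width image: nothing to expand (and chunks_exact(0) would panic).
        return Some(out);
    }
    for row in packed[..packed_len].chunks_exact(row_bytes) {
        for x in 0..width {
            // Select bit `x` of the row, counting from the most significant bit.
            let bit = (row[x / 8] >> (7 - (x % 8))) & 1;
            out.push(if bit == 1 { set_value } else { 0 });
        }
    }
    Some(out)
}
```

Because the input length is checked before allocating, the output capacity is bounded by eight times the bytes actually supplied, so a bogus header alone cannot force a huge allocation in this sketch.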
||
|
||
let mut buffer = Vec::with_capacity(usize::try_from(compressed_length)?); | ||
reader.take(compressed_length).read_to_end(&mut buffer)?; | ||
|
||
// all extant tiff/fax4 decoders I've found always assume that the photometric interpretation | ||
// is `WhiteIsZero`, ignoring the tag. ImageMagick appears to generate fax4-encoded tiffs | ||
// with the tag incorrectly set to `BlackIsZero`. | ||
fax::decoder::decode_g4(buffer.into_iter(), width, Some(height), |transitions| { | ||
Could you pass
It'd be
The bit reader in the fax decoder pulls single bytes anyway. I started changing the API to accept an Iterator of Result<u8, E>. (so
You could also see if taking a
FWIW, I do not think it is worth worrying too much about micro-optimizations here. It makes sense to handle malicious inputs without blowing up the heap, but in practice all of the images I'm seeing are crappy low-res scans of checks from god-knows-what device a regional bank purchased in 1993. Really doubt there are many images of this format in the wild that demand fully optimized best-case performance
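The idea of feeding the decoder an iterator of `Result<u8, E>` instead of a fully buffered `Vec<u8>` can be sketched with the standard library alone. This is only an illustration: `bounded_bytes` and `consume` are hypothetical names, and `consume` is a stand-in for a decoder, not the `fax` crate's API.

```rust
use std::io::{self, BufReader, Read};

/// Yield at most `limit` bytes from `reader`, one `io::Result<u8>` at a time.
/// Nothing is collected up front, so memory use does not depend on `limit`.
fn bounded_bytes<R: Read>(reader: R, limit: u64) -> impl Iterator<Item = io::Result<u8>> {
    // `take` caps how much is read; `BufReader` avoids a read call per byte.
    BufReader::new(reader.take(limit)).bytes()
}

/// Hypothetical stand-in for a decoder: counts the bytes it is given and
/// stops at the first I/O error.
fn consume<I: Iterator<Item = io::Result<u8>>>(bytes: I) -> io::Result<usize> {
    let mut count = 0;
    for byte in bytes {
        byte?; // propagate read errors instead of panicking
        count += 1;
    }
    Ok(count)
}
```

With this shape, a compressed length taken from an untrusted header limits how much is read, but never how much is allocated.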
||
out.extend(fax::decoder::pels(transitions, width).map(|c| match c { | ||
fax::Color::Black => 255, | ||
fax::Color::White => 0, | ||
})) | ||
}); | ||
Box::new(Cursor::new(out)) | ||
} | ||
method => { | ||
return Err(TiffError::UnsupportedError( | ||
TiffUnsupportedError::UnsupportedCompressionMethod(method), | ||
|
@@ -633,7 +654,7 @@ impl Image { | |
|
||
let padding_right = chunk_dims.0 - data_dims.0; | ||
|
||
- let mut reader = Self::create_reader( | ||
+ let mut reader = self.create_reader( | ||
reader, | ||
photometric_interpretation, | ||
compression_method, | ||
|
Please pass in the relevant width/height as arguments rather than taking `self`.
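One way to act on this request, and on the earlier question about chunk versus image dimensions, is to compute the per-chunk size at the call site and hand it to the reader constructor. The sketch below covers only strips and uses hypothetical helper names (`strip_dimensions`, `fax4_dimensions`); it is not the crate's actual chunk logic.

```rust
use std::convert::TryFrom;

/// Dimensions `(width, height)` of strip `index`, assuming the image is split
/// into strips of `rows_per_strip` rows and the last strip may be shorter.
/// Returns `None` if the strip does not exist.
fn strip_dimensions(
    image_width: u32,
    image_height: u32,
    rows_per_strip: u32,
    index: u32,
) -> Option<(u32, u32)> {
    if rows_per_strip == 0 {
        return None;
    }
    let start_row = index.checked_mul(rows_per_strip)?;
    if start_row >= image_height {
        return None;
    }
    // The final strip holds whatever rows remain.
    let height = rows_per_strip.min(image_height - start_row);
    Some((image_width, height))
}

/// Narrow chunk dimensions to the `u16` pair used by the Fax4 branch in the
/// diff, returning `None` when a dimension does not fit.
fn fax4_dimensions(dims: (u32, u32)) -> Option<(u16, u16)> {
    Some((u16::try_from(dims.0).ok()?, u16::try_from(dims.1).ok()?))
}
```

A constructor that receives these values as plain arguments no longer needs `self`, which is what keeps it callable as `Self::create_reader(...)`.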