Add Canon CR3 raw file support #41

rolandd · 2025-08-28T02:31:55Z

I have a Canon camera that produces CR3 raw files and I wanted nom-exif support
for EXIF extraction so I hacked this up. I tried to imitate the HEIC/HEIF handling
since CR3 files also use ISOBMFF, but I definitely don't have a deep understanding
of the right way to implement this, so please let me know if anything needs to be
fixed up.

As more file formats are added, it gets messy if exif.rs has code that handles the specific internals of each file format. Move most of the logic from heif_extract_exif() into heif::extract_exif_data() in heif.rs.

rolandd · 2025-08-28T20:04:25Z

One question: CR3 files have more useful EXIF data in a second "CMT2" box (in addition to the CMT1 box that holds the most basic info). What would be the best way to handle merging two discontiguous TIFF structures when parsing a single file?

mindeng · 2025-08-29T04:21:39Z

There are some code formatting issues; please refer to the output of cargo fmt --check for details.

mindeng · 2025-08-29T04:35:48Z

> One question: CR3 files have more useful EXIF data in a second "CMT2" box (in addition to the CMT1 box that holds the most basic info). What would be the best way to handle merging two discontiguous TIFF structures when parsing a single file?

ExifIter/IfdIter should already provide the possibility to traverse multiple IFDs.

rolandd · 2025-09-02T20:42:27Z

Thanks, I took all the changes suggested by "cargo fmt" and pushed that out.

Will look at using the ifds member of ExifIter to handle the EXIF data in the CMT2 box as well, but I think this is ready to land if you think it would be useful.

rolandd · 2025-09-03T20:26:17Z

Looking more at the ExifIter implementation, I do see how it handles multiple IfdIters in the ifds member, but it seems we need a new way of constructing an ExifIter that takes something like a Vec<&[u8]> instead of an Option<&[u8]> to handle multiple discontiguous boxes with EXIF data in them? Does that make sense as a way to proceed?

mindeng · 2025-09-04T12:50:04Z

> Looking more at the ExifIter implementation, I do see how it handles multiple IfdIters in the ifds member, but it seems we need a new way of constructing an ExifIter that takes something like a Vec<&[u8]> instead of an Option<&[u8]> to handle multiple discontiguous boxes with EXIF data in them? Does that make sense as a way to proceed?

If you want to process all IFDs in CR3, I think the relatively straightforward approach for now would be to parse all CMT* boxes, extract all the necessary Exif data, and then merge them into a single Vec<u8>. However, I'm not sure whether the offsets in the Exif data within CMT2 and subsequent boxes might cause issues; this point needs to be carefully verified.
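To make the offset concern concrete: in TIFF, the first-IFD offset stored in the header counts from the start of that TIFF header, so after two blocks are concatenated the second block's offsets need a base added before they can be used as positions in the merged buffer. The following is a minimal sketch of that arithmetic, not code from this PR; `parse_tiff_header` and `first_ifd_position` are hypothetical helper names, and the byte arrays used with them would be hand-made test data rather than real CMT box contents.

```rust
/// Byte order declared by a TIFF header: "II" is little-endian, "MM" is big-endian.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Endian {
    Little,
    Big,
}

/// Parse the 8-byte TIFF header and return the byte order plus the offset of
/// the first IFD. That offset counts from the first byte of `tiff`.
fn parse_tiff_header(tiff: &[u8]) -> Option<(Endian, u32)> {
    if tiff.len() < 8 {
        return None;
    }
    let endian = match [tiff[0], tiff[1]] {
        [b'I', b'I'] => Endian::Little,
        [b'M', b'M'] => Endian::Big,
        _ => return None,
    };
    let magic_bytes = [tiff[2], tiff[3]];
    let offset_bytes = [tiff[4], tiff[5], tiff[6], tiff[7]];
    let (magic, offset) = match endian {
        Endian::Little => (u16::from_le_bytes(magic_bytes), u32::from_le_bytes(offset_bytes)),
        Endian::Big => (u16::from_be_bytes(magic_bytes), u32::from_be_bytes(offset_bytes)),
    };
    // Every TIFF header carries the magic number 42 after the byte-order mark.
    if magic != 42 {
        return None;
    }
    Some((endian, offset))
}

/// Absolute position of a block's first IFD inside a merged buffer, where
/// `base` is the index at which that block's TIFF header starts (hypothetical helper).
fn first_ifd_position(merged: &[u8], base: usize) -> Option<usize> {
    let (_, offset) = parse_tiff_header(merged.get(base..)?)?;
    base.checked_add(offset as usize)
}
```

If two blocks are appended into one Vec<u8>, calling `first_ifd_position` with `base` set to the length of the first block gives the correct position for the second block's IFD. Using the stored offset without the base would point into the first block instead, which is the failure mode to verify against.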

rolandd · 2025-09-04T15:52:39Z

> If you want to process all IFDs in CR3, I think the relatively straightforward approach for now would be to parse all CMT* boxes, extract all the necessary Exif data, and then merge them into a single Vec<u8>. However, I'm not sure whether the offsets in the Exif data within CMT2 and subsequent boxes might cause issues; this point needs to be carefully verified.

Just to be clear - CR3 handling should copy the EXIF data into a new consolidated buffer and then parse that with the existing code?

mindeng · 2025-09-05T08:47:51Z

Hi @rolandd

After studying the structure of CR3 files, I found that it differs somewhat from my initial understanding. The CMT1/CMT2 sections here are independent TIFF structures. Therefore, I have added a MultiExifIter to support this scenario.

Additionally, I have implemented ParseOutput for MultiExifIter. To enable traversing all TIFF/Exif data, you can parse the CR3 file as follows: let mut iter: MultiExifIter = parser.parse(ms).unwrap();
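For readers unfamiliar with the pattern, the type annotation on `iter` is what selects the output: a generic `parse` can return different result types when each type implements a shared trait. The toy sketch below illustrates only that mechanism. `Blocks`, `FromBlocks`, `FirstBlock`, and `AllBlocks` are hypothetical stand-ins invented for this example and are not nom-exif's actual types or signatures.

```rust
/// Stand-in for the raw Exif blocks found in one file (hypothetical type).
struct Blocks(Vec<Vec<u8>>);

/// Toy counterpart of a ParseOutput-style trait: each output type knows how
/// to build itself from the parsed blocks.
trait FromBlocks: Sized {
    fn from_blocks(blocks: &Blocks) -> Option<Self>;
}

/// Output that exposes only the first block.
struct FirstBlock(Vec<u8>);

/// Output that exposes every block.
struct AllBlocks(Vec<Vec<u8>>);

impl FromBlocks for FirstBlock {
    fn from_blocks(blocks: &Blocks) -> Option<Self> {
        blocks.0.first().cloned().map(FirstBlock)
    }
}

impl FromBlocks for AllBlocks {
    fn from_blocks(blocks: &Blocks) -> Option<Self> {
        if blocks.0.is_empty() {
            None
        } else {
            Some(AllBlocks(blocks.0.clone()))
        }
    }
}

/// The caller's type annotation decides which implementation runs.
fn parse<O: FromBlocks>(blocks: &Blocks) -> Option<O> {
    O::from_blocks(blocks)
}
```

With this shape, `let first: FirstBlock = parse(&blocks).unwrap();` and `let all: AllBlocks = parse(&blocks).unwrap();` call the same function but receive different views of the same data.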

Note that the current implementation of parse_multi_exif_iter is likely incomplete. Please refer to the comments and refine the relevant CR3 processing logic.

The code has been submitted to the PR branch.

Other issues:

- In the uuid.rs file, CMT1/CMT2/CMT3 are hard-coded. I am uncertain whether this approach is appropriate. For example, are these names fixed? Could there be additional CMT* boxes (e.g., CMT4/CMT5) that need to be parsed?
- The file size of testdata/canon-r6.cr3 is too large. Please try to strip out image data and other non-essential information, retaining only the minimal data required for parsing TIFF/Exif information.

rolandd · 2025-09-06T16:39:33Z

Thanks, will look at this code and work on it. Meanwhile I updated the branch with a minimized CR3 file (down to ~400K) that still has all the metadata and is accepted by exiftool.

I looked back at https://github.com/lclevy/canon_cr3 and think it's safe to hard-code CMT1/CMT2/CMT3/CMT4 (CMT4 is not in my code but seems to be used for GPS info, I'll try to get a test file with GPS data in it). But anyway I think Canon has probably frozen the file structure for now.
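As a rough illustration of what hard-coding the names could look like while still noticing unexpected CMT* boxes, here is a minimal sketch. It is not the code in uuid.rs; `CmtBox`, `classify_cmt`, and `looks_like_cmt` are hypothetical names, and the per-box descriptions in the comments only repeat what is said in this thread.

```rust
/// The CMT* boxes discussed in this thread.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CmtBox {
    /// Holds the most basic info.
    Cmt1,
    /// Holds the more useful EXIF data.
    Cmt2,
    /// Present in the current code; its contents are not described in this thread.
    Cmt3,
    /// Seems to be used for GPS info.
    Cmt4,
}

/// Map a four-byte box type to a known CMT box, or `None` for anything else.
fn classify_cmt(box_type: &[u8; 4]) -> Option<CmtBox> {
    match box_type {
        b"CMT1" => Some(CmtBox::Cmt1),
        b"CMT2" => Some(CmtBox::Cmt2),
        b"CMT3" => Some(CmtBox::Cmt3),
        b"CMT4" => Some(CmtBox::Cmt4),
        _ => None,
    }
}

/// True for any box type of the form "CMT" followed by an ASCII digit, so a
/// caller can log or skip names such as "CMT5" instead of silently missing them.
fn looks_like_cmt(box_type: &[u8; 4]) -> bool {
    box_type[0] == b'C'
        && box_type[1] == b'M'
        && box_type[2] == b'T'
        && box_type[3].is_ascii_digit()
}
```

Keeping the two checks separate means the fixed list stays explicit, while a future CMT5 would at least be detectable.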

See https://github.com/lclevy/canon_cr3 for information about the CR3 file format.

- Add testdata/canon-r6.cr3: minimized valid CR3 file based on an image from a Canon R6 camera
- Update to detect CR3 files in file.rs based on brand name 'crx '
- Add bbox/cr3_moov.rs to handle 'moov' boxes and bbox/uuid.rs to handle Canon UUID sub-boxes that contain EXIF data for CR3 files
- Add cr3.rs to handle extracting EXIF from CR3 files
- Add basic test cases for CR3 parsing
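Brand-based detection can be sketched as follows, assuming the file begins with an ISOBMFF `ftyp` box (4-byte big-endian size, 4-byte type, then the 4-byte major brand). This is an illustration only and not the code in file.rs; `major_brand` and `is_cr3` are hypothetical names.

```rust
/// Return the major brand if `data` starts with an `ftyp` box.
fn major_brand(data: &[u8]) -> Option<[u8; 4]> {
    // Size (4 bytes) + type (4 bytes) + major brand (4 bytes).
    if data.len() < 12 {
        return None;
    }
    let size = u32::from_be_bytes([data[0], data[1], data[2], data[3]]);
    // Sizes 0 and 1 have special meanings in ISOBMFF; this sketch does not
    // handle them and rejects any declared size too small to hold a brand.
    if &data[4..8] != b"ftyp" || size < 12 {
        return None;
    }
    Some([data[8], data[9], data[10], data[11]])
}

/// True when the major brand is 'crx ' (note the trailing space).
fn is_cr3(data: &[u8]) -> bool {
    major_brand(data) == Some(*b"crx ")
}
```

A real detector would probably also look at the compatible-brands list that follows the minor version; that part is left out here.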

rolandd · 2025-09-18T03:00:05Z

Sorry, just getting back to this, I'm a little confused what the intention is with MultiExifIter. Is the idea that the consumer of the library needs to know which file types are multi-exif and which are fully parsed with a simple ExifIter? Wouldn't it be more ergonomic to extend the ExifIter internals to allow for multiple TIFF / IFD structures as in CR3 files, but leave a unified API for consumers parsing image files?

mindeng · 2025-09-21T00:14:31Z

> Sorry, just getting back to this, I'm a little confused what the intention is with MultiExifIter. Is the idea that the consumer of the library needs to know which file types are multi-exif and which are fully parsed with a simple ExifIter? Wouldn't it be more ergonomic to extend the ExifIter internals to allow for multiple TIFF / IFD structures as in CR3 files, but leave a unified API for consumers parsing image files?

Yes, reusing and extending ExifIter is indeed an approach that maintains API consistency, but it comes with two issues:

- It would introduce additional complexity to ExifIter, making it harder to maintain and less aligned with the "Single Responsibility Principle."
- In practice, users may need to explicitly know they are handling multiple Exif data blocks. For example, if there are tag conflicts between multiple Exif blocks, what strategy should be adopted to handle them? MultiExifIter includes a duplicate_strategy field to address such cases, currently defaulting to the IgnoreDuplicates strategy (perhaps a set method should be added to allow users to control this strategy).
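To illustrate the tag-conflict question, here is a minimal sketch of merging entries from several independent Exif blocks under a duplicate strategy. It assumes that IgnoreDuplicates means "keep the first occurrence of a tag and skip later ones", which is a guess at the intended behavior. The enum, the `AllowDuplicates` variant, and `merge_blocks` are hypothetical and do not reflect the actual MultiExifIter implementation.

```rust
use std::collections::HashSet;

/// How to treat a tag that appears in more than one block (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DuplicateStrategy {
    /// Keep the first occurrence of a tag and skip later ones (assumed meaning).
    IgnoreDuplicates,
    /// Yield every entry, even when a tag repeats (hypothetical alternative).
    AllowDuplicates,
}

/// Merge `(tag, value)` entries from several blocks, visiting blocks in order.
fn merge_blocks(
    blocks: &[Vec<(u16, String)>],
    strategy: DuplicateStrategy,
) -> Vec<(u16, String)> {
    let mut seen = HashSet::new();
    let mut merged = Vec::new();
    for block in blocks {
        for (tag, value) in block {
            // `insert` returns true only the first time a tag is recorded.
            let first_time = seen.insert(*tag);
            if first_time || strategy == DuplicateStrategy::AllowDuplicates {
                merged.push((*tag, value.clone()));
            }
        }
    }
    merged
}
```

Under the assumed IgnoreDuplicates behavior, block order decides which value wins, so a setter for the strategy would also raise the question of whether callers need to control that order.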

Under the current design, even if users continue to use ExifIter to receive parsing results for CR3 files, it will still work and provide access to the first Exif block's information. If they want to process all Exif blocks, they can optionally use MultiExifIter to receive the results.
Therefore, introducing MultiExifIter at this stage does not break API compatibility and preserves the option to extend ExifIter with built-in support for multiple Exif blocks in the future (if truly necessary).
However, if we were to directly modify ExifIter now to support multiple Exif structures, I still have the two concerns mentioned above, so I cannot yet make a decision.

These are my thoughts. Feel free to discuss and share your ideas anytime. Thank you.

heif: Refactor heif_extract_exif() into heif.rs

1095a3e


rolandd force-pushed the main branch from fee4118 to 23ab986 on September 2, 2025 20:40

rolandd force-pushed the main branch from 302ce70 to ef4dd3d on September 6, 2025 16:39

rolandd and others added 4 commits September 6, 2025 21:40

feat(exif): add MultiExifIter

83096a1

feat: impl ParseOutput for MultiExifIter

2a8d5bb

fix(bbox): lint issues

9f556ac

rolandd force-pushed the main branch from ef4dd3d to 9f556ac on September 7, 2025 04:40
