Convert VideoCodecParameters.extra_data to vector #323

sscobici · 2024-10-19T20:11:14Z

it is based on constants renaming commit from PR #322
This PR is to get early feedback, mkv changes are not finished, I will add another commit later
Notes:

used let mut VideoCodecParameters and write values directly into it, instead of creating many temp variables and then copy them to VideoCodecParameters. This is not following other code pattern, but with the growing number of variables can be more suitable/readable. Please let me know your opinion.
removed VisualCodecSpecific, initially this followed AudioCodecSpecific, but I'm not sure it is needed. VideoCodecParameters should have codec_id_tag (FourCC code) where VisualCodecSpecific variations can be stored.

sscobici · 2024-10-20T23:01:36Z

added mkv dolby vision configuration extra data parsing

sscobici · 2024-10-21T17:35:06Z

fixed mkv dovi parsing with two BlockAdditionMapping elements,
added DOVIDecoderConfigurationRecord parsing to common

pdeljanov · 2024-10-22T03:49:30Z

Did a quick skim of this and everything seems very reasonable to me!

used let mut VideoCodecParameters and write values directly into it, instead of creating many temp variables and then copy them to VideoCodecParameters. This is not following other code pattern, but with the growing number of variables can be more suitable/readable. Please let me know your opinion.

That's fine. I generally like immutable variables, but past a few it can become messy and writing into a mutable struct is cleaner.

Further thought: Do you think the builder methods should just be removed in general on the codec parameters structs? I believe a proper builder pattern would actually use a separate object all together.

removed VisualCodecSpecific, initially this followed AudioCodecSpecific, but I'm not sure it is needed. VideoCodecParameters should have codec_id_tag (FourCC code) where VisualCodecSpecific variations can be stored.

I'm not attached to Audio/VideoCodecSpecific, however, it would be preferable if the same pattern is used for both.

sscobici · 2024-10-22T21:52:17Z

Further thought: Do you think the builder methods should just be removed in general on the codec parameters structs? I believe a proper builder pattern would actually use a separate object all together.

I think the existing methods in the core codec parameters should be removed if we agree to adopt the struct initialization pattern. While a proper builder pattern does require a separate object, I’m not sure if you want to introduce the builder pattern in the library due to the potential runtime overhead.

I'm not attached to Audio/VideoCodecSpecific, however, it would be preferable if the same pattern is used for both.

Ok, let's try pattern alignment ;-). I've added two commits for your high-level review. These might be too many changes and potentially go against the contribution policy or introduce regressions, so feel free to reject them. I can then try to follow more closely to the existing code patterns in MP4.

Anyway, here are my reasons for the changes:

They align video and audio codec parameter creation in MP4.
They remove AudioCodecSpecific and treat AudioSampleEntry as a generic container for codec data, which is later transformed into AudioCodecParameters. I reviewed the MKV code, and it doesn't require additional enums to carry codec-specific info; instead, it uses TrackElement as a container. This aligns the approaches between MP4 and MKV.
At a high level, we now have:

format model (TrackElement/Audio(Video)Element for mkv and StsdAtom/Audio(Video)SampleEntry for mp4).
generic model (Audio(Video)CodecParameters
We should be able to create a trait to convert a format model into a generic model, which each format model can implement (see StsdAtom::make_codec_params).

It should be possible to create a trait to convert Atom data into SampleEntry, simplifying some of the MP4 code (see AlacAtom::fill_audio_sample_entry, FlacAtom::fill_audio_sample_entry, etc.).

This PR lacks the following:

Removal of new() and builder methods from core codec parameters.
moving make_codec_params into TrackElement class.
A trait for converting the format model to the generic model
A trait for filling audio/video sample entries in MP4

Please propose trait names if you liked the idea and want them now, or this can be done later.

sscobici · 2024-10-22T22:07:03Z

Forgot to mention that AudioSampleEntry.sample_rate was changed from f64 to u32. It is u32 in AudioCodecParameters, so not sure why it was created as f64.
How can I verify the changes with regression testing?
Any test file pack? Do you know how other libs do regression testing?

pdeljanov · 2024-10-24T00:06:23Z

Hi, thanks for your work!

Hopefully responding to everything below:

Further thought: Do you think the builder methods should just be removed in general on the codec parameters structs? I believe a proper builder pattern would actually use a separate object all together.

I think the existing methods in the core codec parameters should be removed if we agree to adopt the struct initialization pattern. While a proper builder pattern does require a separate object, I’m not sure if you want to introduce the builder pattern in the library due to the potential runtime overhead.

It doesn't seem particularly worth it to have a separate builder.

I recall the builder was introduced to force the user to set an AudioCodecId. However, nowadays, the entire struct can be be defaulted so maintaining the builder seems less relevant.

This pattern seems reasonable:

let params: VideoCodecParameters = Default::default();
params.codec = // CODEC_ID_ ...
// ...etc...

I'm not attached to Audio/VideoCodecSpecific, however, it would be preferable if the same pattern is used for both.

Ok, let's try pattern alignment ;-). I've added two commits for your high-level review. These might be too many changes and potentially go against the contribution policy or introduce regressions, so feel free to reject them. I can then try to follow more closely to the existing code patterns in MP4.

Anyway, here are my reasons for the changes:

They align video and audio codec parameter creation in MP4.

They remove AudioCodecSpecific and treat AudioSampleEntry as a generic container for codec data, which is later transformed into AudioCodecParameters. I reviewed the MKV code, and it doesn't require additional enums to carry codec-specific info; instead, it uses TrackElement as a container. This aligns the approaches between MP4 and MKV.

I'm not quite sure why AudioSampleEntry/VisualSampleEntry in MP4 just duplicate the contents of the AudioCodecParamters/VideoCodecParameters? Seems like at that point you'd just fill the codec parameters instead? I'm probably missing something.

At a high level, we now have:

format model (TrackElement/Audio(Video)Element for mkv and StsdAtom/Audio(Video)SampleEntry for mp4).

generic model (Audio(Video)CodecParameters

I like how this makes a solid distinction between what is read from the media source, and how we represent that in Symphonia. 👍

If the sample entry structs were intended to be the format-model object, then you can likely disregard my last comment. However, then the sample_rate should be an f64 since the container stores sample rate as a 16.16 fixed point number.

In this distinction it would make sense for the format-model objects to store their fields in the data type that most closely matched the data type in the file. That would help debugging, and adds value to what could be a lot of duplication. Then, the conversion would occur when converting into the generic-model.

We should be able to create a trait to convert a format model into a generic model, which each format model can implement (see StsdAtom::make_codec_params).

It should be possible to create a trait to convert Atom data into SampleEntry, simplifying some of the MP4 code (see AlacAtom::fill_audio_sample_entry, FlacAtom::fill_audio_sample_entry, etc.).

Why would we need a trait for this? Is there value in it?

This PR lacks the following:

Removal of new() and builder methods from core codec parameters.

moving make_codec_params into TrackElement class.

A trait for converting the format model to the generic model

A trait for filling audio/video sample entries in MP4

Please propose trait names if you liked the idea and want them now, or this can be done later.

We might be able to cross some of these off based on the discussion on the points above.

Forgot to mention that AudioSampleEntry.sample_rate was changed from f64 to u32. It is u32 in AudioCodecParameters, so not sure why it was created as f64. How can I verify the changes with regression testing? Any test file pack? Do you know how other libs do regression testing?

This should now be answered above!

On an administrative note, I think this entire PR can be reorganized. You can really squash everything down into 4 commits:

Update VideoCodecParameters in symphonia-core
Changes to symphonia-common
Changes to MP4
Changes to MKV

Also, let's not increase the scope of this PR any further than what we've discussed here so far. It is already going to be on the bigger side! Once we've reached a conclusion on all the points above, you can polish this PR off and I can then review and merge.

sscobici · 2024-10-24T03:24:19Z

Thanks for review.

It doesn't seem particularly worth it to have a separate builder.

I will not add more cleanups in this PR. Will make a separate PR.

However, then the sample_rate should be an f64 since the container stores sample rate as a 16.16 fixed point number.

Agree, I just observed that in some places it is read as u32, I will change it back to f64.

Why would we need a trait for this? Is there value in it?

I hope that it might remove some boiler plate code and encourage further formats to follow the common pattern of conversion. I haven't tried it to see if it really worth the benefits. But I agree not to introduce them now.

On an administrative note, I think this entire PR can be reorganized. You can really squash everything down into 4 commits:

This might make some commits to be non-buildable. I'm not sure this is a good, but if you insist I can do it.

sscobici · 2024-10-24T12:12:19Z

should be ready for detailed review now. Thanks for f64 note, the way I changed it to u32 was not correct.

pdeljanov · 2024-10-25T04:18:21Z

On an administrative note, I think this entire PR can be reorganized. You can really squash everything down into 4 commits:

This might make some commits to be non-buildable. I'm not sure this is a good, but if you insist I can do it.

Ah true, I guess you would have to make changes to adapt to extra_data now being a Vec. If you can fix that in 5-10 minutes then I think it's worthwhile for the cleaner history, however, if it would take longer than that, you can leave the commits as is. All commits being buildable is more important than a clean history.

I'll try to review this thoroughly in the next couple of days!

sscobici · 2024-10-25T19:33:29Z

rebased, kept Dolby Vision changes commits separate from the refactoring commits.

pdeljanov

Thanks again. This all looks very clean to me. I have noted a couple nit-picks for your consideration, and one potential hazard.

For testing, I generally use my media library. Unfortunately, since it's copyrighted, I can't share it. However, here's what I generally test:

AAC, ALAC, FLAC, PCM in MP4
- Test by passing file path to symphonia-play, but also by piping in via cat.
Movies
- Audio tracks should be playable. Other tracks detected.
Fragmented MP4
- These should also work when played by file path, or piped in.
- Easiest way to get a fragmented MP4 is from YouTube using yt-dlp. Pass -f 140 --fixup never to get a fragmented AAC audio file.

If these tests pass, it's generally good enough for me to merge.

When it comes to cutting a release, I run my entire library through symphonia-check and make sure there haven't been any regressions since the last release.

symphonia-common/src/mpeg/video/mod.rs

symphonia-format-isomp4/src/atoms/alac.rs

symphonia-format-isomp4/src/atoms/dovi.rs

… generic way

sscobici force-pushed the video-extra-data branch from 55b27cc to e44943f Compare October 20, 2024 22:20

sscobici marked this pull request as ready for review October 20, 2024 23:01

sscobici force-pushed the video-extra-data branch from e44943f to f75a276 Compare October 21, 2024 17:33

sscobici force-pushed the video-extra-data branch 3 times, most recently from 5474b40 to b5a3ae9 Compare October 22, 2024 20:57

sscobici force-pushed the video-extra-data branch from b5a3ae9 to 761d505 Compare October 24, 2024 12:06

sscobici force-pushed the video-extra-data branch from 761d505 to 92a3459 Compare October 24, 2024 14:24

sscobici force-pushed the video-extra-data branch from 92a3459 to 466109c Compare October 25, 2024 19:30

pdeljanov requested changes Oct 28, 2024

View reviewed changes

symphonia-common/src/mpeg/video/mod.rs Outdated Show resolved Hide resolved

symphonia-format-isomp4/src/atoms/alac.rs Outdated Show resolved Hide resolved

symphonia-format-isomp4/src/atoms/dovi.rs Show resolved Hide resolved

sscobici added 5 commits November 2, 2024 22:11

core, isomp4, mkv: collect many extra_data in VideoCodecParameters

5dee8ef

isomp4: use AudioSampleEntry to carry codec_specific information in a…

c7155be

… generic way

common: add Dolby Vision config parsing

330f838

mkv: parse Dolby Vision configuration

9c87c1f

isomp4: parse Dolby Vision configuration

168dbde

sscobici force-pushed the video-extra-data branch from 466109c to 168dbde Compare November 2, 2024 20:44

sscobici requested a review from pdeljanov November 2, 2024 20:49

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Convert VideoCodecParameters.extra_data to vector #323

Convert VideoCodecParameters.extra_data to vector #323

sscobici commented Oct 19, 2024

sscobici commented Oct 20, 2024

sscobici commented Oct 21, 2024 •

edited

Loading

pdeljanov commented Oct 22, 2024

sscobici commented Oct 22, 2024

sscobici commented Oct 22, 2024

pdeljanov commented Oct 24, 2024

sscobici commented Oct 24, 2024

sscobici commented Oct 24, 2024

pdeljanov commented Oct 25, 2024

sscobici commented Oct 25, 2024

pdeljanov left a comment

Convert VideoCodecParameters.extra_data to vector #323

Are you sure you want to change the base?

Convert VideoCodecParameters.extra_data to vector #323

Conversation

sscobici commented Oct 19, 2024

sscobici commented Oct 20, 2024

sscobici commented Oct 21, 2024 • edited Loading

pdeljanov commented Oct 22, 2024

sscobici commented Oct 22, 2024

sscobici commented Oct 22, 2024

pdeljanov commented Oct 24, 2024

sscobici commented Oct 24, 2024

sscobici commented Oct 24, 2024

pdeljanov commented Oct 25, 2024

sscobici commented Oct 25, 2024

pdeljanov left a comment

Choose a reason for hiding this comment

sscobici commented Oct 21, 2024 •

edited

Loading