
I would like to know about the parameters during training. #131

Open
ootsuka-repos opened this issue Dec 21, 2024 · 6 comments

@ootsuka-repos

About video_resolution_buckets: it defaults to 49x512x768, but I believe the order is frames, height, width. Is there an upper limit for frames, height, and width?

https://huggingface.co/Lightricks/LTX-Video

The LTX-Video model card recommends num_frames=161 and fps=24. Does that mean frames can go up to 161? Also, I think I can specify fps when using the factory, but internally is 1 second of video always trained as 24 fps?
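For reference, a bucket spec like the default can be read as frames × height × width. This is only a minimal parsing sketch under that assumed ordering; the trainer's actual flag name and parsing may differ:

```python
# Hypothetical sketch: parse a bucket spec such as "49x512x768",
# assuming the order is frames x height x width (as the default suggests).
def parse_bucket(spec: str) -> tuple[int, int, int]:
    frames, height, width = (int(part) for part in spec.split("x"))
    return frames, height, width

print(parse_bucket("49x512x768"))  # (49, 512, 768)
```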

I know that each model has different input dimensions.

This has gotten long, but I really have just one question: how do I find the upper limit I can set for video_resolution_buckets when training HunyuanVideo or LTX-Video?

Translated with DeepL.com (free version)

@sayakpaul
Collaborator

I don't think we have a validation method to check for that. Perhaps we could add some guidelines in the README about how to set that parameter. Would that help? Cc: @a-r-r-o-w

@a-r-r-o-w
Owner

a-r-r-o-w commented Dec 23, 2024

> Is there an upper limit for frames, height, and width?

Yes, it is the same as the original repository: 257 frames, 720 height, 1280 width. But Diffusers does not yet support framewise encoding/decoding in the VAE, so I doubt one could reach up to 257 frames for finetuning even on an 80GB card. I will look into this soon.
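A quick sketch of checking a bucket against these limits. The constants below come from the numbers in this comment; whether a bucket actually fits in memory for finetuning also depends on available VRAM:

```python
# Sketch: check a (frames, height, width) bucket against the LTX-Video
# upper limits mentioned above (257 frames, 720 height, 1280 width).
LTX_MAX_FRAMES, LTX_MAX_HEIGHT, LTX_MAX_WIDTH = 257, 720, 1280

def within_ltx_limits(frames: int, height: int, width: int) -> bool:
    return (frames <= LTX_MAX_FRAMES
            and height <= LTX_MAX_HEIGHT
            and width <= LTX_MAX_WIDTH)

print(within_ltx_limits(49, 512, 768))    # True  (the default bucket)
print(within_ltx_limits(261, 512, 768))   # False (too many frames)
```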

> In this case, frames is up to 161, but I think I could specify fps when using factory, but internally, 1 second is trained as 24 fps?

We don't actually support training with the frame_rate parameter yet. I've hardcoded this to 24 for the moment. It should be easy enough to support, but it would help to have a training run where this works as expected and doesn't cause terrible results. I will try and work on it this weekend.

> How do I know how far I can set the value of video_resolution_buckets as an upper limit in training HunyuanVideo or LTX-VIDEO?

Whatever upper limit the original repository specifies is supported here as well, because we rely on the Diffusers implementations (which are numerically exact matches to the original code). However, we are still working on optimizing memory requirements, so it will take some more time before higher-resolution training is possible.
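One additional practical constraint, not stated in this thread and assumed here from the public VAE configurations: video VAEs compress time and space by fixed ratios, so valid frame counts typically have the form k * temporal_ratio + 1 and spatial dimensions must be divisible by the spatial ratio. A sketch under those assumptions:

```python
# Sketch (assumption, not from this thread): check that a bucket is
# compatible with a video VAE's compression ratios. Valid frame counts
# are of the form k * temporal_ratio + 1; height and width must be
# divisible by the spatial ratio.
def bucket_ok(frames: int, height: int, width: int,
              temporal_ratio: int, spatial_ratio: int) -> bool:
    return ((frames - 1) % temporal_ratio == 0
            and height % spatial_ratio == 0
            and width % spatial_ratio == 0)

# HunyuanVideo: assumed 4x temporal, 8x spatial compression
print(bucket_ok(49, 512, 768, temporal_ratio=4, spatial_ratio=8))    # True
# LTX-Video: assumed 8x temporal, 32x spatial compression
print(bucket_ok(161, 512, 768, temporal_ratio=8, spatial_ratio=32))  # True
```

This would explain why the defaults and recommended values (49, 161, 257 frames) all fit the k * ratio + 1 pattern, but the ratios above should be verified against the actual model configs.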

@ootsuka-repos
Author

Thank you, I roughly understand now.
As sayakpaul mentioned, video_resolution_buckets can be complex, so having user guidelines would make it easier to understand.

@sayakpaul
Collaborator

Thanks @ootsuka-repos. Would you maybe like to help us with a PR?

@ootsuka-repos
Author

OK @sayakpaul, I will open a documentation PR later.

@ootsuka-repos
Author

@sayakpaul @a-r-r-o-w

PR created:
#181

I added parameter documentation.
Please let me know if there are any mistakes, since this is a tentative draft.

Also, I am using AI translation since I am not a native English speaker. If there are any nuances or phrasings that could cause differences in interpretation, please let me know.

Please also let me know if there is anything else that needs to be done outside of the core implementation.
I will respond as time allows.
