[Feature] Add Cosmos2 i2v pipeline #837
base: main
Conversation
_class_name: str = "AutoencoderKLWan"
_diffusers_version: str = "0.34.0.dev0"
_name_or_path: str = ""
These fields should be popped/removed in the loader, so they can be removed here. You can refer to how Wan's VAE config is defined.
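A minimal sketch of the loader-side cleanup being suggested, assuming a hypothetical `strip_hub_metadata` helper; the key names come from the diff above, everything else is illustrative:

```python
# Hypothetical helper: drop Hugging Face hub metadata keys from the raw
# JSON config before mapping it onto the arch config dataclass, so fields
# like _class_name never need to be declared on the config itself.
HUB_METADATA_KEYS = ("_class_name", "_diffusers_version", "_name_or_path")

def strip_hub_metadata(raw_config: dict) -> dict:
    cleaned = dict(raw_config)
    for key in HUB_METADATA_KEYS:
        cleaned.pop(key, None)  # ignore keys that are absent
    return cleaned
```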
def __post_init__(self):
    self.blend_num_frames = (self.tile_sample_min_num_frames -
                             self.tile_sample_stride_num_frames) * 2
(No newline at end of file)
Missing newline char at end of file.
@dataclass
class CosmosVideoConfigFixed(CosmosVideoConfig):
Why is this needed?
class CosmosVideoConfigFixed(CosmosVideoConfig):
    """Fixed Cosmos Video Config that matches original Cosmos2 Video2World configuration."""

    def update_model_arch(self, config: dict) -> None:
Did you align this against diffusers or the original Cosmos implementation?
fastvideo/configs/sample/cosmos.py
Outdated
@dataclass
class CosmosTeaCacheParams(CacheParams):
This can be removed for now, but you should add defaults for the Cosmos sampling params. Refer to Wan's for an example.
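A rough sketch of what a Cosmos sampling-params default could look like, loosely following the Wan pattern mentioned above; the class name and numeric defaults are placeholders, not values taken from the PR:

```python
from dataclasses import dataclass

@dataclass
class Cosmos2SamplingParams:
    # Placeholder defaults; the real values should come from the
    # Cosmos2 Video2World reference configuration.
    num_inference_steps: int = 35
    guidance_scale: float = 7.0
    num_frames: int = 93
    height: int = 704
    width: int = 1280
    negative_prompt: str = ""
```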
fastvideo/layers/layernorm.py
Outdated
if self.has_weight:
    self.weight = nn.Parameter(self.weight)

def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
Can you rename this to forward_diffusers?
Force-pushed from ca83c7e to 8b89aff
return x.flatten(-2)

def apply_rotary_emb(
Maybe we should consider making this cosmos2 specific?
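One possible shape for a cosmos2-scoped variant, kept next to the Cosmos2 transformer instead of in a shared layers module; the function name is hypothetical, and the rotation convention here is the standard interleaved real-valued form, which may differ from what the PR actually uses:

```python
import torch

def apply_rotary_emb_cosmos2(x: torch.Tensor, freqs_cos: torch.Tensor,
                             freqs_sin: torch.Tensor) -> torch.Tensor:
    # Rotate interleaved (even, odd) channel pairs: standard real-valued RoPE.
    x_even, x_odd = x.unflatten(-1, (-1, 2)).unbind(-1)
    rotated = torch.stack((-x_odd, x_even), dim=-1).flatten(-2)
    return x * freqs_cos + rotated * freqs_sin
```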
return imgs

def get_timestep_embedding(
Maybe also make this architecture-specific.
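Similarly, a sketch of a cosmos2-scoped timestep embedding; this is the standard sinusoidal formulation, and the name, sin/cos ordering, and scaling are assumptions that would need to match the checkpoint:

```python
import math
import torch

def get_timestep_embedding_cosmos2(timesteps: torch.Tensor,
                                   embedding_dim: int,
                                   max_period: float = 10000.0) -> torch.Tensor:
    # Standard sinusoidal embedding: half the channels are cos, half are sin.
    half_dim = embedding_dim // 2
    exponent = -math.log(max_period) * torch.arange(
        half_dim, dtype=torch.float32, device=timesteps.device) / half_dim
    freqs = torch.exp(exponent)
    args = timesteps.float()[:, None] * freqs[None, :]
    return torch.cat([torch.cos(args), torch.sin(args)], dim=-1)
```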
bs, seq_len, _ = hidden_states.shape
num_seqs = bs
n, c = self.n_heads, self.d_model // self.total_num_heads
#n, c = self.n_heads, self.d_model // self.total_num_heads
delete this
qkv, _ = self.qkv_proj(hidden_states)
# Projection of 'own' hidden state (self-attention). No GQA here.
q, k, v = qkv.split(self.inner_dim, dim=-1)
#q, k, v = qkv.split(self.inner_dim, dim=-1)
delete this
timesteps_array = np.linspace(self._sigma_to_t(self.sigma_max),
                              self._sigma_to_t(self.sigma_min),
                              num_inference_steps)
t_max = self._sigma_to_t(self.sigma_max)
maybe make a separate version of this scheduler
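A small sketch of how the timestep grid from the diff could live in a Cosmos-specific scheduler's set_timesteps instead of being patched into the shared scheduler; the helper name and signature are illustrative only:

```python
from typing import Callable
import numpy as np

def cosmos_timestep_grid(sigma_to_t: Callable[[float], float],
                         sigma_max: float, sigma_min: float,
                         num_inference_steps: int) -> np.ndarray:
    # Mirrors the diff above: linearly spaced timesteps between t(sigma_max)
    # and t(sigma_min), intended for a separate Cosmos-specific scheduler
    # subclass rather than a change to the shared class.
    t_max = sigma_to_t(sigma_max)
    t_min = sigma_to_t(sigma_min)
    return np.linspace(t_max, t_min, num_inference_steps)
```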
# Peiyuan: using GPU seed will cause A100 and H100 to generate different results...
batch.generator = [
    torch.Generator("cpu").manual_seed(seed) for seed in seeds
    torch.Generator(device="cpu").manual_seed(seed) for seed in seeds
revert please
maybe also separate these
make arch specific
test both
self.world_size = self.fastvideo_args.num_gpus
self.shutting_down = False

# Initialize CUDA before setting up multiprocessing to ensure
remove
Force-pushed from 275f0b2 to 1b7bfe5
No description provided.