Regarding the release date of the video editing code #12

Open
tayton42 opened this issue Dec 2, 2024 · 3 comments

tayton42 commented Dec 2, 2024

Thank you for your work! Do you plan to release the video editing code this week? If not, please let me know and I will try to modify it myself. If so, I will wait. Thank you!😘

wangjiangshan0725 (Owner) commented

Dear @tayton42,

Thank you for your interest! We definitely plan to release this part of the code, but the exact timing may be slightly delayed, possibly depending on the outcome of the paper review. We appreciate your patience.

Btw, we are also exploring more powerful video generation models (such as Mochi) for video editing, which are expected to outperform OpenSora.

tayton42 commented Dec 2, 2024

Understood. Thank you for your reply; looking forward to your future work!

tayton42 closed this as completed Dec 2, 2024

tayton42 commented Dec 4, 2024

Hi! I have tried to modify Mochi's sampling code according to the formula, but I ran into difficulties implementing the inversion: the result of the inversion deviates significantly from the original video, and I am sure the problem is not with the VAE. Can you give me some advice? Below is the sampling code I modified based on Mochi.

import random

import numpy as np
import torch
from einops import repeat

# compute_packed_indices, assert_eq, get_new_progress_bar and dit_latents_to_vae_latents
# come from Mochi's pipeline code, same as in the original sample_model.

def sample_model_rfsolver(device, dit, conditioning, **args):
    random.seed(args["seed"])
    np.random.seed(args["seed"])
    torch.manual_seed(args["seed"])

    generator = torch.Generator(device=device)
    generator.manual_seed(args["seed"])

    w, h, t = args["width"], args["height"], args["num_frames"]
    sample_steps = args["num_inference_steps"]
    cfg_schedule = args["cfg_schedule"]
    sigma_schedule = args["sigma_schedule"]
    inversion = args["inversion"]
    if inversion:
        # For inversion, walk the schedule from data (sigma = 0) towards noise (sigma = 1).
        sigma_schedule = sigma_schedule[::-1]

    assert_eq(len(cfg_schedule), sample_steps, "cfg_schedule must have length sample_steps")
    assert_eq((t - 1) % 6, 0, "t - 1 must be divisible by 6")
    assert_eq(
        len(sigma_schedule),
        sample_steps + 1,
        "sigma_schedule must have length sample_steps + 1",
    )

    B = 1
    SPATIAL_DOWNSAMPLE = 8
    TEMPORAL_DOWNSAMPLE = 6
    IN_CHANNELS = 12
    latent_t = ((t - 1) // TEMPORAL_DOWNSAMPLE) + 1
    latent_w, latent_h = w // SPATIAL_DOWNSAMPLE, h // SPATIAL_DOWNSAMPLE

    # The original random initialization is replaced by a caller-provided latent
    # (e.g. the encoded source video for inversion, or an inverted latent for editing):
    # z = torch.randn(
    #     (B, IN_CHANNELS, latent_t, latent_h, latent_w),
    #     device=device,
    #     dtype=torch.float32,
    # )
    z = args["latent"]

    num_latents = latent_t * latent_h * latent_w
    cond_batched = cond_text = cond_null = None
    if "cond" in conditioning:
        cond_text = conditioning["cond"]
        cond_null = conditioning["null"]
        cond_text["packed_indices"] = compute_packed_indices(device, cond_text["y_mask"][0], num_latents)
        cond_null["packed_indices"] = compute_packed_indices(device, cond_null["y_mask"][0], num_latents)
    else:
        cond_batched = conditioning["batched"]
        cond_batched["packed_indices"] = compute_packed_indices(device, cond_batched["y_mask"][0], num_latents)
        z = repeat(z, "b ... -> (repeat b) ...", repeat=2)

    def model_fn(*, z, sigma, cfg_scale):
        if cond_batched:
            with torch.autocast("cuda", dtype=torch.bfloat16):
                out = dit(z, sigma, **cond_batched)
            out_cond, out_uncond = torch.chunk(out, chunks=2, dim=0)
        else:
            nonlocal cond_text, cond_null
            with torch.autocast("cuda", dtype=torch.bfloat16):
                if cfg_scale == 0.0:
                    # cfg_scale == 0 reduces to the unconditional prediction.
                    return dit(z, sigma, **cond_null).to(z)
                out_cond = dit(z, sigma, **cond_text)
                out_uncond = dit(z, sigma, **cond_null)
        assert out_cond.shape == out_uncond.shape
        out_uncond = out_uncond.to(z)
        out_cond = out_cond.to(z)
        return out_uncond + cfg_scale * (out_cond - out_uncond)

    # Second-order (RF-Solver-style) sampler: an Euler step plus a midpoint-based
    # correction, with customizable sigma schedule & cfg scale.
    for i in get_new_progress_bar(range(0, sample_steps), desc="Sampling"):
        sigma = sigma_schedule[i]
        dsigma = sigma - sigma_schedule[i + 1]

        # `pred` estimates `z_0 - eps` at the current sigma.
        pred = model_fn(
            z=z,
            sigma=torch.full([B] if cond_text else [B * 2], sigma, device=z.device),
            cfg_scale=cfg_schedule[i],
        )

        # Half Euler step to the midpoint sigma, then re-evaluate the model there.
        z_mid = z + dsigma / 2 * pred
        pred_mid = model_fn(
            z=z_mid,
            sigma=torch.full([B] if cond_text else [B * 2], sigma - dsigma / 2, device=z.device),
            cfg_scale=cfg_schedule[i],
        )
        # assert pred.dtype == torch.float32

        # Finite-difference estimate of the velocity derivative for the second-order term.
        first_order = (pred_mid - pred) / (dsigma / 2)
        z = z + dsigma * pred + 0.5 * dsigma ** 2 * first_order

    z = z[:B] if cond_batched else z
    if inversion:
        # For inversion, return the DiT-space latent directly (no conversion to VAE latents).
        return z
    return dit_latents_to_vae_latents(z)
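
To rule out a mistake in the update rule itself, I also round-tripped the same Euler-plus-midpoint step on a toy velocity field (a standalone sketch, not from the Mochi repo; toy_pred is just a made-up stand-in for the DiT):

import torch

def toy_pred(z, sigma):
    # Smooth stand-in for the model output `pred` (an estimate of z_0 - eps); for testing only.
    return torch.tanh(z) * (1.0 - sigma) - z * sigma

def solve(z, sigma_schedule):
    # Same Euler + midpoint-correction update as in sample_model_rfsolver above.
    for i in range(len(sigma_schedule) - 1):
        sigma = sigma_schedule[i]
        dsigma = sigma - sigma_schedule[i + 1]
        pred = toy_pred(z, sigma)
        z_mid = z + dsigma / 2 * pred
        pred_mid = toy_pred(z_mid, sigma - dsigma / 2)
        first_order = (pred_mid - pred) / (dsigma / 2)
        z = z + dsigma * pred + 0.5 * dsigma ** 2 * first_order
    return z

torch.manual_seed(0)
z0 = torch.randn(4, 8)                          # stand-in "clean" latent
sigmas = torch.linspace(0.0, 1.0, 65).tolist()  # 64 steps, data (0) -> noise (1)

z_inv = solve(z0, sigmas)           # inversion direction
z_rec = solve(z_inv, sigmas[::-1])  # sample back from the inverted latent
print((z_rec - z0).abs().max())     # only discretization error; shrinks as steps grow

With this toy field the round trip matches up to discretization error, so I suspect the deviation in the real pipeline comes from something else, e.g. using different cfg_schedule values for the inversion pass and the re-sampling pass (with different guidance scales the two runs follow different vector fields, so exact reconstruction would not be expected).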

tayton42 reopened this Dec 4, 2024