Description
Hello, I have successfully configured the environment and run the inference processes for steps 2 and 3 of step 3, but I encountered the following error when running step 1 of step 3. Could you please help me resolve it? Thank you.
(camclone) root@HOST:/PATH/TO/CamCloneMaster# python inference_i2v.py --dataset_path demo/example_csv/infer/example_i2v_testset.csv --ckpt_path models/CamCloneMaster-Wan2.1/Wan-I2V-1.3B-Step8000.ckpt --output_dir demo/i2v_output
Loading models from: ./models/Wan-AI/Wan2.1-T2V-1.3B/diffusion_pytorch_model.safetensors
model_name: wan_video_dit model_class: WanModel
This model is initialized with extra kwargs: {'has_image_input': False, 'patch_size': [1, 2, 2], 'in_dim': 16, 'dim': 1536, 'ffn_dim': 8960, 'freq_dim': 256, 'text_dim': 4096, 'out_dim': 16, 'num_heads': 12, 'num_layers': 30, 'eps': 1e-06}
The following models are loaded: ['wan_video_dit'].
Loading models from: ./models/Wan-AI/Wan2.1-T2V-1.3B/models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth
model_name: wan_video_image_encoder model_class: WanImageEncoder
The following models are loaded: ['wan_video_image_encoder'].
Loading models from: ./models/Wan-AI/Wan2.1-T2V-1.3B/models_t5_umt5-xxl-enc-bf16.pth
model_name: wan_video_text_encoder model_class: WanTextEncoder
The following models are loaded: ['wan_video_text_encoder'].
Loading models from: ./models/Wan-AI/Wan2.1-T2V-1.3B/Wan2.1_VAE.pth
model_name: wan_video_vae model_class: WanVideoVAE
The following models are loaded: ['wan_video_vae'].
Using wan_video_text_encoder from ./models/Wan-AI/Wan2.1-T2V-1.3B/models_t5_umt5-xxl-enc-bf16.pth.
Using wan_video_dit from ./models/Wan-AI/Wan2.1-T2V-1.3B/diffusion_pytorch_model.safetensors.
Using wan_video_vae from ./models/Wan-AI/Wan2.1-T2V-1.3B/Wan2.1_VAE.pth.
Using wan_video_image_encoder from ./models/Wan-AI/Wan2.1-T2V-1.3B/models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth.
/PATH/TO/conda_envs/camclone/lib/python3.10/site-packages/torchvision/transforms/v2/_deprecated.py:42: UserWarning: The transform ToTensor() is deprecated and will be removed in a future release. Instead, please use v2.Compose([v2.ToImage(), v2.ToDtype(torch.float32, scale=True)]). Output is equivalent up to float precision.
warnings.warn(
Traceback (most recent call last):
File "/PATH/TO/CamCloneMaster/inference_i2v.py", line 231, in
video = pipe(
File "/PATH/TO/conda_envs/camclone/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
return func(*args, **kwargs)
File "/PATH/TO/CamCloneMaster/diffsynth/pipelines/wan_video_i2v.py", line 272, in __call__
content_latents = torch.zeros_like(ref_latents)
TypeError: zeros_like(): argument 'input' (position 1) must be Tensor, not NoneType
Note: I used AI to redact the personal information (paths and hostname) from the logs above.
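For context, the failure itself is easy to reproduce in isolation: `torch.zeros_like` requires a `Tensor`, so the traceback means `ref_latents` was still `None` when `wan_video_i2v.py` reached line 272, i.e. the reference latents were never populated in this code path. A minimal sketch of the failure (the variable name mirrors the traceback; this is an illustration, not the repository's code):

```python
import torch

# ref_latents is apparently never assigned in this pipeline branch,
# so it arrives at the zeros_like call as None.
ref_latents = None

try:
    # Same call shape as wan_video_i2v.py line 272.
    content_latents = torch.zeros_like(ref_latents)
except TypeError as e:
    # Reproduces: zeros_like(): argument 'input' (position 1) must be
    # Tensor, not NoneType
    print(e)
```

This suggests the bug is upstream of line 272: whatever step is supposed to encode the reference video/image into `ref_latents` (likely tied to the dataset CSV or the checkpoint's expected inputs) did not run, rather than `zeros_like` itself being at fault.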