
Windows Errors: Running /training #81

Closed
1 of 2 tasks
oliverban opened this issue Nov 7, 2024 · 4 comments
Comments


oliverban commented Nov 7, 2024

System Info

CUDA 12.4
Python 3.12
TORCH 2.5.1+cu124
2x3090

Information

  • The official example scripts
  • My own modified scripts

Is there any doc or step-by-step guide for Windows training? I have installed all the requirements, but I still get an error when I run training*.py; see below (cogfac is my conda environment):

(cogfac) C:\Users\Oliver\Documents\Github\cogvideox-factory>python training/cogvideox_text_to_video_lora.py
The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
Traceback (most recent call last):
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\transformers\utils\import_utils.py", line 1778, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\importlib\__init__.py", line 90, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1331, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 995, in exec_module
  File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\transformers\modeling_utils.py", line 59, in <module>
    from .quantizers import AutoHfQuantizer, HfQuantizer
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\transformers\quantizers\__init__.py", line 14, in <module>
    from .auto import AutoHfQuantizer, AutoQuantizationConfig
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\transformers\quantizers\auto.py", line 44, in <module>
    from .quantizer_torchao import TorchAoHfQuantizer
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\transformers\quantizers\quantizer_torchao.py", line 35, in <module>
    from torchao.quantization import quantize_
ImportError: cannot import name 'quantize_' from 'torchao.quantization' (C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\torchao\quantization\__init__.py). Did you mean: 'Quantizer'?

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\utils\import_utils.py", line 853, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\importlib\__init__.py", line 90, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1331, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 995, in exec_module
  File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\loaders\unet.py", line 46, in <module>
    from .lora_pipeline import LORA_WEIGHT_NAME, LORA_WEIGHT_NAME_SAFE, TEXT_ENCODER_NAME, UNET_NAME
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\loaders\lora_pipeline.py", line 36, in <module>
    from .lora_base import LoraBaseMixin
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\loaders\lora_base.py", line 44, in <module>
    from transformers import PreTrainedModel
  File "<frozen importlib._bootstrap>", line 1412, in _handle_fromlist
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\transformers\utils\import_utils.py", line 1766, in __getattr__
    module = self._get_module(self._class_to_module[name])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\transformers\utils\import_utils.py", line 1780, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.modeling_utils because of the following error (look up to see its traceback):
cannot import name 'quantize_' from 'torchao.quantization' (C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\torchao\quantization\__init__.py)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\utils\import_utils.py", line 853, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\importlib\__init__.py", line 90, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1310, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1331, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 995, in exec_module
  File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\models\autoencoders\__init__.py", line 1, in <module>
    from .autoencoder_asym_kl import AsymmetricAutoencoderKL
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\models\autoencoders\autoencoder_asym_kl.py", line 23, in <module>
    from .vae import DecoderOutput, DiagonalGaussianDistribution, Encoder, MaskConditionDecoder
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\models\autoencoders\vae.py", line 25, in <module>
    from ..unets.unet_2d_blocks import (
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\models\unets\__init__.py", line 6, in <module>
    from .unet_2d import UNet2DModel
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\models\unets\unet_2d.py", line 24, in <module>
    from .unet_2d_blocks import UNetMidBlock2D, get_down_block, get_up_block
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\models\unets\unet_2d_blocks.py", line 36, in <module>
    from ..transformers.dual_transformer_2d import DualTransformer2DModel
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\models\transformers\__init__.py", line 13, in <module>
    from .prior_transformer import PriorTransformer
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\models\transformers\prior_transformer.py", line 9, in <module>
    from ...loaders import PeftAdapterMixin, UNet2DConditionLoadersMixin
  File "<frozen importlib._bootstrap>", line 1412, in _handle_fromlist
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\utils\import_utils.py", line 843, in __getattr__
    module = self._get_module(self._class_to_module[name])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\utils\import_utils.py", line 855, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import diffusers.loaders.unet because of the following error (look up to see its traceback):
Failed to import transformers.modeling_utils because of the following error (look up to see its traceback):
cannot import name 'quantize_' from 'torchao.quantization' (C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\torchao\quantization\__init__.py)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\Oliver\Documents\Github\cogvideox-factory\training\cogvideox_text_to_video_lora.py", line 37, in <module>
    from diffusers import (
  File "<frozen importlib._bootstrap>", line 1412, in _handle_fromlist
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\utils\import_utils.py", line 844, in __getattr__
    value = getattr(module, name)
            ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\utils\import_utils.py", line 843, in __getattr__
    module = self._get_module(self._class_to_module[name])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\utils\import_utils.py", line 855, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import diffusers.models.autoencoders.autoencoder_kl_cogvideox because of the following error (look up to see its traceback):
Failed to import diffusers.loaders.unet because of the following error (look up to see its traceback):
Failed to import transformers.modeling_utils because of the following error (look up to see its traceback):
cannot import name 'quantize_' from 'torchao.quantization' (C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\torchao\quantization\__init__.py)
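
The failing import at the bottom of the chain is quantize_, which transformers' TorchAo quantizer pulls from torchao.quantization and which only exists in torchao 0.4.0 and newer. A quick way to check what the environment actually has (a hedged sketch; check_torchao is a hypothetical helper, not part of this repo):

```python
# Sketch: report which torchao is installed and whether it exposes quantize_
# (added in torchao 0.4.0; transformers' TorchAo quantizer imports it).
# check_torchao is a hypothetical helper, not part of this repo.
import importlib
import importlib.metadata
import importlib.util

def check_torchao():
    """Return (installed version or None, whether quantize_ is present)."""
    if importlib.util.find_spec("torchao") is None:
        return None, False
    version = importlib.metadata.version("torchao")
    quant = importlib.import_module("torchao.quantization")
    return version, hasattr(quant, "quantize_")

version, has_quantize = check_torchao()
print(f"torchao: {version}, quantize_ available: {has_quantize}")
```

If the version printed is below 0.4.0 (or the attribute is missing), upgrading or reinstalling torchao is the likely fix.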
oliverban changed the title from "Windows support but how to use?" to "Windows Errors: Running /training" on Nov 7, 2024
@KaptainSisay

Same here, but I can't install the requirements at all because torchao 0.4.0 and newer aren't available for Windows. Not sure if there's a workaround besides compiling torchao yourself or running on Linux.

@a-r-r-o-w
Owner

I'm unsure how to fix it, but this looks like an environment issue related to the torchao installation. Could you try doing a clean install with USE_CPP=0 pip install --force-reinstall torchao?

I don't have a Windows device to test on, unfortunately, but I've heard people have had success doing so. If the latest torchao version does not work, could you try installing one of the older stable versions, since we don't depend on their newest features? Gentle ping to @Nojahhh in case he has encountered this, since he's been super helpful with Windows-related issues.
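
One wrinkle on Windows: the POSIX-style USE_CPP=0 pip install ... prefix is not valid in cmd.exe, so the variable has to be set in the environment instead. A portable way to express the same reinstall (a sketch; build_install_command is a hypothetical helper, not part of this repo):

```python
# Sketch: portable equivalent of "USE_CPP=0 pip install --force-reinstall torchao".
# cmd.exe does not accept the "VAR=value command" prefix, so the variable is
# placed in the child-process environment instead. build_install_command is a
# hypothetical helper, not part of this repo.
import os
import sys

def build_install_command(package="torchao"):
    """Return (argv, env) for a pip force-reinstall with C++ builds disabled."""
    env = dict(os.environ, USE_CPP="0")  # skip compiling torchao's C++ extensions
    argv = [sys.executable, "-m", "pip", "install", "--force-reinstall", package]
    return argv, env

# To actually run it:
# import subprocess
# argv, env = build_install_command()
# subprocess.run(argv, env=env, check=True)
```

In cmd.exe the same thing can be done manually with "set USE_CPP=0" before the pip command; in PowerShell, with $env:USE_CPP = "0".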


toyxyz commented Dec 1, 2024

pytorch/ao#957 (comment)

@sayakpaul
Collaborator

We have added an 8-bit optimizer from bitsandbytes and DeepSpeed support in our new trainer API. Could you try that and let us know? Currently only LTX-Video and HunyuanVideo LoRA T2V fine-tuning is supported. CogVideoX T2V support is coming in #165, and LTX-Video I2V support is coming in #150.
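
For reference, using the bitsandbytes 8-bit optimizer generally amounts to swapping torch.optim.AdamW for bnb.optim.AdamW8bit. A hedged sketch, not the trainer API itself (make_optimizer is a hypothetical helper; note that AdamW8bit needs a bitsandbytes build with GPU support, so the "compiled without GPU support" warning in the log above means the 8-bit path would be unavailable in that environment):

```python
# Hedged sketch: prefer bitsandbytes' 8-bit AdamW when available, otherwise
# fall back to the standard full-precision torch optimizer. make_optimizer is
# a hypothetical helper, not part of the new trainer API.
def make_optimizer(params, lr=1e-4, use_8bit=True):
    if use_8bit:
        try:
            import bitsandbytes as bnb
            return bnb.optim.AdamW8bit(params, lr=lr)  # ~75% less optimizer-state memory
        except ImportError:
            pass  # bitsandbytes not installed: fall back below
    import torch
    return torch.optim.AdamW(params, lr=lr)
```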
