
Windows Errors: Running /training #81

Closed
1 of 2 tasks
oliverban opened this issue Nov 7, 2024 · 4 comments
Comments


oliverban commented Nov 7, 2024

System Info

CUDA 12.4
Python 3.12
TORCH 2.5.1+cu124
2x3090

Information

  • The official example scripts
  • My own modified scripts

Is there any doc or step-by-step guide for Windows training? I have installed all the requirements, but I still get an error when I run training*.py; see below (cogfac is my conda environment):

(cogfac) C:\Users\Oliver\Documents\Github\cogvideox-factory>python training/cogvideox_text_to_video_lora.py
The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
Traceback (most recent call last):
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\transformers\utils\import_utils.py", line 1778, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\importlib\__init__.py", line 90, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1331, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 995, in exec_module
  File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\transformers\modeling_utils.py", line 59, in <module>
    from .quantizers import AutoHfQuantizer, HfQuantizer
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\transformers\quantizers\__init__.py", line 14, in <module>
    from .auto import AutoHfQuantizer, AutoQuantizationConfig
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\transformers\quantizers\auto.py", line 44, in <module>
    from .quantizer_torchao import TorchAoHfQuantizer
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\transformers\quantizers\quantizer_torchao.py", line 35, in <module>
    from torchao.quantization import quantize_
ImportError: cannot import name 'quantize_' from 'torchao.quantization' (C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\torchao\quantization\__init__.py). Did you mean: 'Quantizer'?

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\utils\import_utils.py", line 853, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\importlib\__init__.py", line 90, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1331, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 995, in exec_module
  File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\loaders\unet.py", line 46, in <module>
    from .lora_pipeline import LORA_WEIGHT_NAME, LORA_WEIGHT_NAME_SAFE, TEXT_ENCODER_NAME, UNET_NAME
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\loaders\lora_pipeline.py", line 36, in <module>
    from .lora_base import LoraBaseMixin
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\loaders\lora_base.py", line 44, in <module>
    from transformers import PreTrainedModel
  File "<frozen importlib._bootstrap>", line 1412, in _handle_fromlist
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\transformers\utils\import_utils.py", line 1766, in __getattr__
    module = self._get_module(self._class_to_module[name])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\transformers\utils\import_utils.py", line 1780, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.modeling_utils because of the following error (look up to see its traceback):
cannot import name 'quantize_' from 'torchao.quantization' (C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\torchao\quantization\__init__.py)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\utils\import_utils.py", line 853, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\importlib\__init__.py", line 90, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1310, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1331, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 995, in exec_module
  File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\models\autoencoders\__init__.py", line 1, in <module>
    from .autoencoder_asym_kl import AsymmetricAutoencoderKL
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\models\autoencoders\autoencoder_asym_kl.py", line 23, in <module>
    from .vae import DecoderOutput, DiagonalGaussianDistribution, Encoder, MaskConditionDecoder
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\models\autoencoders\vae.py", line 25, in <module>
    from ..unets.unet_2d_blocks import (
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\models\unets\__init__.py", line 6, in <module>
    from .unet_2d import UNet2DModel
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\models\unets\unet_2d.py", line 24, in <module>
    from .unet_2d_blocks import UNetMidBlock2D, get_down_block, get_up_block
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\models\unets\unet_2d_blocks.py", line 36, in <module>
    from ..transformers.dual_transformer_2d import DualTransformer2DModel
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\models\transformers\__init__.py", line 13, in <module>
    from .prior_transformer import PriorTransformer
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\models\transformers\prior_transformer.py", line 9, in <module>
    from ...loaders import PeftAdapterMixin, UNet2DConditionLoadersMixin
  File "<frozen importlib._bootstrap>", line 1412, in _handle_fromlist
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\utils\import_utils.py", line 843, in __getattr__
    module = self._get_module(self._class_to_module[name])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\utils\import_utils.py", line 855, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import diffusers.loaders.unet because of the following error (look up to see its traceback):
Failed to import transformers.modeling_utils because of the following error (look up to see its traceback):
cannot import name 'quantize_' from 'torchao.quantization' (C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\torchao\quantization\__init__.py)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\Oliver\Documents\Github\cogvideox-factory\training\cogvideox_text_to_video_lora.py", line 37, in <module>
    from diffusers import (
  File "<frozen importlib._bootstrap>", line 1412, in _handle_fromlist
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\utils\import_utils.py", line 844, in __getattr__
    value = getattr(module, name)
            ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\utils\import_utils.py", line 843, in __getattr__
    module = self._get_module(self._class_to_module[name])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\diffusers\utils\import_utils.py", line 855, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import diffusers.models.autoencoders.autoencoder_kl_cogvideox because of the following error (look up to see its traceback):
Failed to import diffusers.loaders.unet because of the following error (look up to see its traceback):
Failed to import transformers.modeling_utils because of the following error (look up to see its traceback):
cannot import name 'quantize_' from 'torchao.quantization' (C:\Users\Oliver\MiniConda3\envs\cogfac\Lib\site-packages\torchao\quantization\__init__.py)
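
The failing import at the bottom of the chain is quantize_, which transformers' TorchAo quantizer pulls from torchao.quantization and which only exists in torchao 0.4.0 and newer. A quick way to check what the environment actually has (a hedged sketch; check_torchao is a hypothetical helper, not part of this repo):

```python
# Sketch: report which torchao is installed and whether it exposes quantize_
# (added in torchao 0.4.0; transformers' TorchAo quantizer imports it).
# check_torchao is a hypothetical helper, not part of this repo.
import importlib
import importlib.metadata
import importlib.util

def check_torchao():
    """Return (installed version or None, whether quantize_ is present)."""
    if importlib.util.find_spec("torchao") is None:
        return None, False
    version = importlib.metadata.version("torchao")
    quant = importlib.import_module("torchao.quantization")
    return version, hasattr(quant, "quantize_")

version, has_quantize = check_torchao()
print(f"torchao: {version}, quantize_ available: {has_quantize}")
```

If the version printed is below 0.4.0 (or the attribute is missing), upgrading or reinstalling torchao is the likely fix.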
oliverban changed the title from "Windows support but how to use?" to "Windows Errors: Running /training" on Nov 7, 2024
@KaptainSisay

Same here, but I can't install the requirements at all because torchao 0.4.0 and newer aren't available for Windows. Not sure if there's a workaround besides compiling torchao yourself or running on Linux.

@a-r-r-o-w
Owner

I'm unsure how to fix it, but this looks like an environment issue related to the torchao installation. Could you try doing a clean install with USE_CPP=0 pip install --force-reinstall torchao?

I don't have a Windows device to test on, unfortunately, but I've heard people have had success doing so. If the latest torchao version does not work, could you try installing one of the older stable versions, since we don't depend on their newest features? Gentle ping to @Nojahhh in case he has encountered this, since he's been super helpful with Windows-related issues.
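
One wrinkle on Windows: the POSIX-style USE_CPP=0 pip install ... prefix is not valid in cmd.exe, so the variable has to be set in the environment instead. A portable way to express the same reinstall (a sketch; build_install_command is a hypothetical helper, not part of this repo):

```python
# Sketch: portable equivalent of "USE_CPP=0 pip install --force-reinstall torchao".
# cmd.exe does not accept the "VAR=value command" prefix, so the variable is
# placed in the child-process environment instead. build_install_command is a
# hypothetical helper, not part of this repo.
import os
import sys

def build_install_command(package="torchao"):
    """Return (argv, env) for a pip force-reinstall with C++ builds disabled."""
    env = dict(os.environ, USE_CPP="0")  # skip compiling torchao's C++ extensions
    argv = [sys.executable, "-m", "pip", "install", "--force-reinstall", package]
    return argv, env

# To actually run it:
# import subprocess
# argv, env = build_install_command()
# subprocess.run(argv, env=env, check=True)
```

In cmd.exe the same thing can be done manually with "set USE_CPP=0" before the pip command; in PowerShell, with $env:USE_CPP = "0".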


toyxyz commented Dec 1, 2024

pytorch/ao#957 (comment)

@sayakpaul
Collaborator

We have added an 8-bit optimizer from bitsandbytes and DeepSpeed support in our new trainer API. Could you try that and let us know? Currently only LTX-Video and HunyuanVideo LoRA T2V fine-tuning is supported. CogVideoX T2V support is coming in #165, and LTX-Video I2V support is coming in #150.
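
For reference, using the bitsandbytes 8-bit optimizer generally amounts to swapping torch.optim.AdamW for bnb.optim.AdamW8bit. A hedged sketch, not the trainer API itself (make_optimizer is a hypothetical helper; note that AdamW8bit needs a bitsandbytes build with GPU support, so the "compiled without GPU support" warning in the log above means the 8-bit path would be unavailable in that environment):

```python
# Hedged sketch: prefer bitsandbytes' 8-bit AdamW when available, otherwise
# fall back to the standard full-precision torch optimizer. make_optimizer is
# a hypothetical helper, not part of the new trainer API.
def make_optimizer(params, lr=1e-4, use_8bit=True):
    if use_8bit:
        try:
            import bitsandbytes as bnb
            return bnb.optim.AdamW8bit(params, lr=lr)  # ~75% less optimizer-state memory
        except ImportError:
            pass  # bitsandbytes not installed: fall back below
    import torch
    return torch.optim.AdamW(params, lr=lr)
```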
