
add mixed precision training support for cyclegan turbo #30

Open · wants to merge 1 commit into main
Conversation

King-HAW
Copy link

Hi Gaurav,

I've added mixed precision support for training CycleGAN-Turbo, so that unpaired training can run on a 24 GB NVIDIA GPU.

@seerdecker
Copy link

I tried to run this, but it fails:

Loading model from: /home/ubuntu/miniconda3/envs/img2img-turbo/lib/python3.10/site-packages/lpips/weights/v0.1/vgg.pth
Steps:   0%|                                                                                                                                                                      | 0/25000 [00:00<?, ?it/s]Traceback (most recent call last):
  File "/home/ubuntu/repos/img2img-turbo/src/train_cyclegan_turbo.py", line 410, in <module>
    main(args)
  File "/home/ubuntu/repos/img2img-turbo/src/train_cyclegan_turbo.py", line 213, in main
    accelerator.clip_grad_norm_(params_gen, args.max_grad_norm)
  File "/home/ubuntu/miniconda3/envs/img2img-turbo/lib/python3.10/site-packages/accelerate/accelerator.py", line 2157, in clip_grad_norm_
    self.unscale_gradients()
  File "/home/ubuntu/miniconda3/envs/img2img-turbo/lib/python3.10/site-packages/accelerate/accelerator.py", line 2107, in unscale_gradients
    self.scaler.unscale_(opt)
  File "/home/ubuntu/miniconda3/envs/img2img-turbo/lib/python3.10/site-packages/torch/cuda/amp/grad_scaler.py", line 284, in unscale_
    optimizer_state["found_inf_per_device"] = self._unscale_grads_(optimizer, inv_scale, found_inf, False)
  File "/home/ubuntu/miniconda3/envs/img2img-turbo/lib/python3.10/site-packages/torch/cuda/amp/grad_scaler.py", line 212, in _unscale_grads_
    raise ValueError("Attempting to unscale FP16 gradients.")
ValueError: Attempting to unscale FP16 gradients.
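For context on why this error fires: with fp16 mixed precision, `GradScaler.unscale_` refuses to operate on gradients that are themselves stored in fp16, since dividing by the loss scale in half precision can overflow or underflow; the master weights (and thus their gradients) are expected to be fp32. A minimal pure-Python sketch of that check (a hypothetical simplification, not torch's actual code):

```python
def unscale_grads(grads, inv_scale):
    """Divide each (dtype, value) gradient by the loss scale.

    Mirrors the guard in torch's grad_scaler.py: gradients stored in
    fp16 are rejected outright, because unscaling in half precision
    is numerically unsafe.
    """
    for dtype, _ in grads:
        if dtype == "float16":
            raise ValueError("Attempting to unscale FP16 gradients.")
    return [(dtype, value * inv_scale) for dtype, value in grads]


# fp32 gradients unscale fine; fp16 gradients reproduce the error above.
print(unscale_grads([("float32", 8.0)], 0.5))
```

bf16 sidesteps this entirely: its exponent range matches fp32, so no loss scaling (and hence no `GradScaler`) is needed, which is why switching to bf16 below avoids the crash.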

@King-HAW
Copy link
Author

> Tried to run this, it fails.
>
> ValueError: Attempting to unscale FP16 gradients.

Hi, please try setting the mixed precision to bf16; that should work. My local GPU is an NVIDIA GeForce RTX 4090 (24 GB).
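For example, the precision can be selected at launch time via Accelerate's `--mixed_precision` flag (the script path follows the traceback above; any training-script flags you pass after it are your own setup):

```shell
# bf16 avoids the fp16 GradScaler path that raises
# "Attempting to unscale FP16 gradients." (bf16 needs no loss scaling).
accelerate launch --mixed_precision=bf16 src/train_cyclegan_turbo.py
```

Alternatively, the same setting can be chosen interactively with `accelerate config`.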

@ACupoFruiTea
Copy link

ACupoFruiTea commented Jul 15, 2024

@King-HAW Hi, thanks for sharing.
I run into the following problem when I use mixed precision:

ValueError: Query/Key/Value should either all have the same dtype, or (in the quantized case) Key/Value should have dtype torch.int32

query.dtype: torch.float32
key.dtype:   torch.bfloat16
value.dtype: torch.bfloat16

I was able to solve it by running accelerate without --enable_xformers_memory_efficient_attention, following https://github.com/huggingface/accelerate/issues/2182.
Have you run into this problem before, and if so, how did you solve it?

@nldhuyen0047
Copy link

nldhuyen0047 commented Aug 28, 2024

Hi @King-HAW, I have tried your fork, but I still run out of memory (I am using a 3090 with 24 GB VRAM). Could you please explain how I can fix this error?

Thank you so much.

5 participants